NEWSLETTER & DOCUMENTS & CASE STUDIES

Previous Newsletters

Click the links below to view or download our previous newsletters:

Documents

Click the links below to view or download the contributions from each partner:

  • GALICIA PRELIMINARY SURVEY REPORT – 08/01/25

    As part of the GALICIA project, a consultation was launched to gather insights from a diverse range of stakeholders, including innovative SMEs and start-ups, research centers, large companies, industry associations, and innovation hubs, in both Europe and Canada. The objective was to understand their experiences and perspectives on the innovation prospects of the project.

    The consultation started at the end of October 2024 and will conclude at the end of February 2025. This report presents the partial results collected up to January 8, 2025. The final survey report will be shared in March 2025.

  • GALICIA KPI4 – Platform logical architecture and requirements

    The KPI4 public report “Platform logical architecture and requirements” details the platform’s architecture, specifying its key components and their features, alongside the software delivery requirements. In particular, this document outlines an application that generates and verifies source code from natural language requirements.

    It features a user-friendly interface, a server that orchestrates interactions with Large Language Models (LLMs) like LLaMA and GPT-4o, and formal verification tools like NuSMV or PyModel.

  • GALICIA KPI6 – Platform Delivery
    The platform implements the logical architecture documented in KPI4, outlining how formal models, natural language inputs, and code generation interact. Its working version was tested across three major use cases (IoT, geometry, and TCP/IP). A first evaluation by an independent panel highlighted both its strengths and areas for improvement, some of which were subsequently implemented. A second evaluation is now ongoing, encompassing further functional testing and the development of comparison metrics.
    At this link it is possible to test the Platform (upon registration).
    Here you can find the evaluation questionnaire.

  • GALICIA KPI7 – Terms of Reference of the mid-term Workshop, contents and questionnaires
    This report is structured around two key documents:
    1. a Report on a Survey conducted in advance with GALICIA stakeholders;
    2. the Terms of Reference (ToR) for the workshop held on March 4, 2025.

    The survey was carried out to gather insights into the challenges and opportunities in verifying AI-generated code, as well as stakeholder perspectives on automated testing and compliance. Its outcomes played a crucial role in shaping the ToR of the workshop, ensuring that its scope, objectives, and key discussion themes align with the most relevant issues identified. The ToR now serves as a strategic framework to guide stakeholder engagement, facilitating discussions on cybersecurity, AI-driven automation, and compliance verification in a way that directly addresses the survey’s findings. The workshop serves as a platform to refine GALICIA’s approach, incorporating feedback from industry and research communities to ensure that the project remains aligned with emerging challenges in digital resilience.

  • GALICIA KPI9 – Proceedings of the mid-term GALICIA workshop
    This document presents the proceedings of the KPI9 Workshop which aimed to foster collaboration, share innovative methodologies, and establish common standards for measuring performance across different systems and projects. The following pages summarize the key presentations, discussions, and outcomes of the event, offering valuable insights into the current trends and future directions in the GALICIA development.

  • GALICIA KPI10 – Platform evaluation results
    This report presents the results of a usability evaluation of the GALICIA platform, a tool developed to generate and validate source code using large language models (LLMs). Conducted as part of the GALICIA project, the evaluation involved external experts who assessed the platform through the System Usability Scale (SUS) methodology. The document outlines the evaluation process, summarizes user feedback, and identifies both strengths and areas requiring improvement. The insights gathered aim to guide future development and enhance the platform’s functionality, reliability, and user experience.

BRIDGING FORMAL METHODS AND GENERATIVE AI:

Insights from the GALICIA Case Studies

Reimagining Code Verification in the AI Era

In a rapidly evolving technological landscape, the challenge of ensuring software correctness remains more relevant than ever. The GALICIA project proposes a novel approach that integrates formal verification methods with generative AI models, specifically large language models (LLMs) like LLaMA and GPT. The core goal is to verify that code generated by AI complies with functional and security standards, and aligns with user-defined requirements.

The methodology aims to:

  • Build trust in AI-generated code;
  • Validate code functionality across programming languages;
  • Prove that a model satisfies both design-time properties (verification) and user-requested outcomes (validation).

Core Methodology: From Natural Language to Verified Code

Development Workflow Overview

GALICIA introduces a structured, iterative workflow that bridges natural language functional requirements (NLFRs) with formal methods:

  • Formal Model Creation: Natural language requirements are translated into formal specifications.
  • Code Generation: LLMs generate source code based on these formal models.
  • Compliance Verification: Techniques such as model-checking are used to verify that generated code aligns with both functional and security standards.

This process is iterative and includes:

  • Refinement cycles, where code is revised based on verification feedback;
  • Code verification against test cases, ensuring it performs correctly, including in boundary cases.
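The refinement cycle above can be sketched in a few lines of Python. All function names here are illustrative placeholders, not the actual platform API: the "formal spec" is reduced to a post-condition, and the LLM generation step is stubbed with hand-written candidates.

```python
# Sketch of GALICIA's iterative workflow: NL requirement -> formal spec ->
# generated code -> verification -> refinement cycle. Names are placeholders.

def formalize(nl_requirement):
    """Stand-in for NL-to-formal-spec translation: here the 'spec' is a
    post-condition relating input x to output y."""
    # Requirement: "return the absolute value of x"
    return lambda x, y: y >= 0 and (y == x or y == -x)

def generate(attempt):
    """Stand-in for LLM code generation; later attempts are refined drafts."""
    if attempt == 0:
        return lambda x: x                     # buggy first draft: fails for x < 0
    return lambda x: x if x >= 0 else -x       # refined draft

def verify(spec, impl, inputs):
    """Model-checking stand-in: exhaustively check the post-condition
    on a set of inputs, including boundary cases."""
    return all(spec(x, impl(x)) for x in inputs)

spec = formalize("return the absolute value of x")
tests = [-10, -1, 0, 1, 10]                    # 0 is the boundary case

for attempt in range(3):                       # refinement cycles
    impl = generate(attempt)
    if verify(spec, impl, tests):
        print(f"verified on attempt {attempt}")
        break
```

The buggy first draft fails verification, triggering one refinement cycle before the spec is satisfied.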

The GALICIA Case Studies

To demonstrate the methodology, GALICIA tested it on several well-known problems:

  • Fibonacci sequence and its recursive implementation
  • Euclidean geometry, including formalization of axioms and logical proof generation
  • WiC Advisor CRUD operations, simulating realistic IoT service interactions
  • TCP/IP protocol, specifically modeling the three-way handshake


These examples span mathematical constructs, protocol behavior, and web service logic, providing a comprehensive validation field.
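The Fibonacci case is the simplest to reproduce. A minimal sketch, assuming a memoized recursive implementation (the exact generated code is not shown in this document): the verification step checks the result against the defining recurrence F(n) = F(n-1) + F(n-2) and the boundary values F(0) and F(1).

```python
# Recursive Fibonacci with a verification pass against its defining recurrence.
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n: int) -> int:
    """F(0) = 0, F(1) = 1, F(n) = F(n-1) + F(n-2)."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

# Verification: boundary values plus the recurrence over a range of inputs
assert fib(0) == 0 and fib(1) == 1
assert all(fib(n) == fib(n - 1) + fib(n - 2) for n in range(2, 30))
print([fib(n) for n in range(10)])  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

Memoization (`lru_cache`) keeps the naive recursion from exploding exponentially, so exhaustive checking over a range of inputs stays cheap.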

WiC Advisor Functional Coverage

This case study tested the GALICIA workflow on an IoT-focused application: the WiC Advisor. Key features:

  • CRUD operations: create, read, update, delete customer records.
  • Address verification: entering a user address like “Piazza Garibaldi 6, Milan”, then querying external operators (TIM, Open Fiber, Fastweb) to retrieve internet service data (e.g., speed, price).
  • Quote creation wizard: guides users in product selection and pricing.
  • Ticketing system: allows users to request technical support and track outcomes.

Each function was tested for:

  • Semantic completeness (does it cover all expected user actions?);
  • Data integrity (are external APIs handled securely?);
  • Error tolerance (can it gracefully handle malformed inputs or downtime?).
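A minimal sketch of the CRUD layer under test; the class and field names are assumptions for this illustration, not the real WiC Advisor API. The create and update paths show the kind of malformed-input handling the error-tolerance criterion probes.

```python
# Illustrative in-memory CRUD store for customer records (hypothetical API).

class CustomerStore:
    """Create, read, update, and delete customer records."""
    def __init__(self):
        self._records = {}
        self._next_id = 1

    def create(self, name, address):
        if not name or not address:                      # error tolerance:
            raise ValueError("name and address required")  # reject malformed input
        cid = self._next_id
        self._next_id += 1
        self._records[cid] = {"name": name, "address": address}
        return cid

    def read(self, cid):
        return self._records.get(cid)                    # None for unknown ids

    def update(self, cid, **fields):
        if cid not in self._records:
            raise KeyError(cid)
        self._records[cid].update(fields)

    def delete(self, cid):
        self._records.pop(cid, None)                     # idempotent delete

store = CustomerStore()
cid = store.create("Mario Rossi", "Piazza Garibaldi 6, Milan")
store.update(cid, address="Piazza Garibaldi 6, Milano")
print(store.read(cid))
store.delete(cid)
assert store.read(cid) is None
```

The real service adds external-operator queries (TIM, Open Fiber, Fastweb) on top of this record layer; those calls are exactly where the data-integrity criterion applies.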

Formalizing Euclid’s Geometry

Euclid’s geometry was chosen due to its clear axiomatic structure, making it ideal for formal modeling. The workflow:

  1. Extracted axioms and definitions, e.g.:
    • “A point is that which has no part”
    • “It is possible to draw a straight line from any point to any point”
  2. Translated into formal logic:
    • Example in first-order logic:
      ∀A,B ∃L (Line(L) ∧ PassesThrough(L,A) ∧ PassesThrough(L,B))
  3. Implemented in multiple languages:
    • Prolog: defined geometric constructs and logic predicates (e.g., distance).
    • Coq: proved geometric properties via theorem proving.
    • Python: visualized or simulated geometric constructions.
  4. Proved the construction of an equilateral triangle, built in Prolog:
    • points are represented as coordinates, with relations such as distance and equality defined between them.


The model was also built in Coq and Python.
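A sketch of what the Python counterpart might look like (the actual generated code is not reproduced here): points as coordinate pairs, a distance relation, and a numerical check that the construction of Euclid’s Elements I.1 yields an equilateral triangle.

```python
# Python rendering of the geometric model: points as coordinates,
# a distance relation, and a check of Euclid's equilateral-triangle
# construction (Elements, Proposition I.1).
import math

def distance(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def equilateral_apex(a, b):
    """Third vertex of the equilateral triangle on segment AB,
    obtained by rotating B around A by 60 degrees."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    cos60, sin60 = 0.5, math.sqrt(3) / 2
    return (a[0] + dx * cos60 - dy * sin60,
            a[1] + dx * sin60 + dy * cos60)

A, B = (0.0, 0.0), (1.0, 0.0)
C = equilateral_apex(A, B)

# Verification: all three sides have (numerically) equal length
sides = (distance(A, B), distance(B, C), distance(C, A))
assert all(math.isclose(s, sides[0]) for s in sides)
print(sides)
```

Note the `math.isclose` comparison: in floating-point coordinates, equality of side lengths holds only up to rounding, which is itself a useful lesson when moving from a theorem prover like Coq to executable Python.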

This case exemplified how well-defined domains can benefit from multi-modal formal modeling and cross-language execution.

The third case study modeled the TCP three-way handshake, focusing on:

  • an Event-B model generated by GPT;
  • its verification in a Python implementation.

The sequence ensures reliable communication between sender and receiver. A typical sequence:

  1. The Sender sends a message to the Receiver;
  2. Upon receiving the message, the Receiver validates its integrity (e.g., checksum);
  3. If the message is valid, the Receiver sends an Acknowledgment (ACK) to the Sender;
  4. The Sender:
    • waits for the ACK within a predefined timeout;
    • if the ACK is received, considers the message successfully transmitted;
    • if no ACK is received (timeout), retransmits the message;
  5. This process repeats until the message is successfully acknowledged or a maximum retry limit is reached.


Despite being a toy example, it demonstrated how even network protocols — typically modeled with finite state machines — can be formalized and verified through GALICIA’s approach.

Modeling the TCP Three-Way Handshake

In this case study, the goal was to model and verify the well-known TCP/IP handshake using a combination of natural language specifications, formal modeling, and generative AI code synthesis.

The communication sequence modeled includes:

  1. The Sender transmits a message (SYN).
  2. The Receiver checks message integrity (e.g., checksum).
  3. If valid, the Receiver sends an acknowledgment (ACK).
  4. The Sender waits for the ACK within a defined timeout.
  5. If no ACK is received, the message is retransmitted.


This behavior ensures reliable data transmission, even in the presence of network faults or delays.
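The retransmission behavior above can be sketched as a small Python simulation. The lossy channel, checksum, and retry limit are illustrative assumptions, and, as in the project's Python rendering of the Event-B model, each event is marked by a printout.

```python
# Sketch of send/validate/ACK with timeout-driven retransmission.
# Channel, checksum, and retry limit are illustrative assumptions.

MAX_RETRIES = 5

def checksum(payload: str) -> int:
    return sum(payload.encode()) % 256

def receiver(payload, expected_checksum):
    """Validate integrity and return an ACK only for valid messages."""
    if checksum(payload) == expected_checksum:
        print("Receiver: message valid, sending ACK")
        return "ACK"
    print("Receiver: checksum mismatch, discarding")
    return None

def send_reliably(payload, channel):
    """Retransmit until an ACK arrives or the retry limit is reached."""
    for attempt in range(1, MAX_RETRIES + 1):
        print(f"Sender: transmitting (attempt {attempt})")
        ack = channel(payload, checksum(payload))  # None models loss/timeout
        if ack == "ACK":
            print("Sender: ACK received, done")
            return True
        print("Sender: timeout, retransmitting")
    return False                                   # retry limit reached

# A lossy channel that drops the first two transmissions
losses = iter([True, True, False])
def lossy_channel(payload, cs):
    if next(losses, False):
        return None                                # simulated loss
    return receiver(payload, cs)

assert send_reliably("SYN", lossy_channel)
```

Even this toy run exercises the three behaviors the formal model must capture: successful delivery, loss followed by retransmission, and termination at the retry limit.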

Formal Model Creation

The Event-B model was generated by GPT through a co-creative process:

  • the LLM provided extensive user support;
  • the Event-B formalism itself was suggested by ChatGPT based on the features of the problem.

Model Verification

The model is implemented in Python:

  • a language extensively used by developers, as confirmed by our stakeholder survey;
  • the implementation closely mirrors the Event-B model;
  • each event is marked by a printout.

Toward a Co-creative Future of Software Development

The GALICIA project shows that building formal models from natural-language specifications is feasible in days, compared to the months required by manual methods. Generative AI, when guided correctly, helps:

  • Select the appropriate formal modeling paradigm;
  • Ensure completeness and consistency in the model;
  • Generate multi-language implementations from a shared logic base.


The process evolves into a co-creative collaboration between human experts and AI tools. While current examples are small, they lay the groundwork for scaling this method to more complex systems — a direction where trustworthy AI-generated software becomes not only possible, but standard.