NEWSLETTER & DOCUMENTS & CASE STUDIES

Previous Newsletters

Click the links below to view or download our previous newsletters:

Documents

Click the links below to view or download the contributions from each partner:

  • GALICIA PRELIMINARY SURVEY REPORT – 08/01/25

    As part of the GALICIA project, a consultation was launched to gather insights from a diverse range of stakeholders, including innovative SMEs and start-ups, research centers, large companies, industry associations, and innovation hubs, in both Europe and Canada. The objective was to understand their experiences and perspectives on the innovation prospects of the project.

    The consultation started at the end of October 2024 and will conclude at the end of February 2025. This report presents the partial results collected up to January 8, 2025. The final survey report will be shared in March 2025.

  • GALICIA KPI4 – Platform logical architecture and requirements

    The KPI4 public report “Platform logical architecture and requirements” details the platform’s architecture, specifying its key components and their features, alongside the software delivery requirements. In particular, this document outlines an application that generates and verifies source code from natural language requirements.

    It features a user-friendly interface, a server that orchestrates interactions with Large Language Models (LLMs) like LLaMA and GPT-4o, and formal verification tools like NuSMV or PyModel.

  • GALICIA KPI6 – Platform Delivery
    The platform implements the logical architecture documented in KPI4, outlining how formal models, natural language inputs, and code generation interact. Its working version was tested across three major use cases (IoT, geometry, and TCP/IP). A first evaluation by an independent panel highlighted both its strengths and areas for improvement, some of which were subsequently implemented. A second evaluation is now ongoing, encompassing further functional testing and the development of comparison metrics.
    At this link it is possible to test the Platform (upon registration).
    Here you can find the evaluation questionnaire.

  • GALICIA KPI7 – Terms of Reference of the mid-term Workshop, contents and questionnaires
    This report is structured around two key documents:
    1. a Report on a Survey conducted in advance with GALICIA stakeholders;
    2. the Terms of Reference (ToR) for the workshop held on March 4, 2025.

    The survey was carried out to gather insights into the challenges and opportunities in verifying AI-generated code, as well as stakeholder perspectives on automated testing and compliance. Its outcomes played a crucial role in shaping the ToR of the workshop, ensuring that its scope, objectives, and key discussion themes align with the most relevant issues identified. The ToR now serves as a strategic framework to guide stakeholder engagement, facilitating discussions on cybersecurity, AI-driven automation, and compliance verification in a way that directly addresses the survey’s findings. The workshop serves as a platform to refine GALICIA’s approach, incorporating feedback from industry and research communities to ensure that the project remains aligned with emerging challenges in digital resilience.

  • GALICIA KPI9 – Proceedings of the mid-term GALICIA workshop
    This document presents the proceedings of the KPI9 Workshop which aimed to foster collaboration, share innovative methodologies, and establish common standards for measuring performance across different systems and projects. The following pages summarize the key presentations, discussions, and outcomes of the event, offering valuable insights into the current trends and future directions in the GALICIA development.

  • GALICIA KPI10 – Platform evaluation results
    This report presents the results of a usability evaluation of the GALICIA platform, a tool developed to generate and validate source code using large language models (LLMs). Conducted as part of the GALICIA project, the evaluation involved external experts who assessed the platform through the System Usability Scale (SUS) methodology. The document outlines the evaluation process, summarizes user feedback, and identifies both strengths and areas requiring improvement. The insights gathered aim to guide future development and enhance the platform’s functionality, reliability, and user experience.

BRIDGING FORMAL METHODS AND GENERATIVE AI:

Insights from the GALICIA Case Studies

Reimagining Code Verification in the AI Era

In a rapidly evolving technological landscape, the challenge of ensuring software correctness remains more relevant than ever. The GALICIA project proposes a novel approach that integrates formal verification methods with generative AI models, specifically large language models (LLMs) like LLaMA and GPT. The core goal is to verify that code generated by AI complies with functional and security standards, and aligns with user-defined requirements.

The methodology aims to:

  • Build trust in AI-generated code;
  • Validate code functionality across programming languages;
  • Prove that a model satisfies both design-time properties (verification) and user-requested outcomes (validation).

Core Methodology: From Natural Language to Verified Code

Development Workflow Overview

GALICIA introduces a structured, iterative workflow that bridges natural language functional requirements (NLFRs) with formal methods:

  • Formal Model Creation: Natural language requirements are translated into formal specifications.
  • Code Generation: LLMs generate source code based on these formal models.
  • Compliance Verification: Techniques such as model-checking are used to verify that generated code aligns with both functional and security standards.

This process is iterative and includes:

  • Refinement cycles, where code is revised based on verification feedback;
  • Code verification against test cases, ensuring it performs correctly, including in boundary cases.
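The refinement cycle above can be sketched in a few lines of Python. All function names here are illustrative placeholders, not the actual platform API: the "formal spec" is reduced to a post-condition, and the LLM generation step is stubbed with hand-written candidates.

```python
# Sketch of GALICIA's iterative workflow: NL requirement -> formal spec ->
# generated code -> verification -> refinement cycle. Names are placeholders.

def formalize(nl_requirement):
    """Stand-in for NL-to-formal-spec translation: here the 'spec' is a
    post-condition relating input x to output y."""
    # Requirement: "return the absolute value of x"
    return lambda x, y: y >= 0 and (y == x or y == -x)

def generate(attempt):
    """Stand-in for LLM code generation; later attempts are refined drafts."""
    if attempt == 0:
        return lambda x: x                     # buggy first draft: fails for x < 0
    return lambda x: x if x >= 0 else -x       # refined draft

def verify(spec, impl, inputs):
    """Model-checking stand-in: exhaustively check the post-condition
    on a set of inputs, including boundary cases."""
    return all(spec(x, impl(x)) for x in inputs)

spec = formalize("return the absolute value of x")
tests = [-10, -1, 0, 1, 10]                    # 0 is the boundary case

for attempt in range(3):                       # refinement cycles
    impl = generate(attempt)
    if verify(spec, impl, tests):
        print(f"verified on attempt {attempt}")
        break
```

The buggy first draft fails verification, triggering one refinement cycle before the spec is satisfied.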

The GALICIA Case Studies

To demonstrate the methodology, GALICIA tested it on several well-known problems:

  • Fibonacci sequence and its recursive implementation
  • Euclidean geometry, including formalization of axioms and logical proof generation
  • WiC Advisor CRUD operations, simulating realistic IoT service interactions
  • TCP/IP protocol, specifically modeling the three-way handshake


These examples span mathematical constructs, protocol behavior, and web service logic, providing a comprehensive validation field.
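The Fibonacci case is the simplest to reproduce. A minimal sketch, assuming a memoized recursive implementation (the exact generated code is not shown in this document): the verification step checks the result against the defining recurrence F(n) = F(n-1) + F(n-2) and the boundary values F(0) and F(1).

```python
# Recursive Fibonacci with a verification pass against its defining recurrence.
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n: int) -> int:
    """F(0) = 0, F(1) = 1, F(n) = F(n-1) + F(n-2)."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

# Verification: boundary values plus the recurrence over a range of inputs
assert fib(0) == 0 and fib(1) == 1
assert all(fib(n) == fib(n - 1) + fib(n - 2) for n in range(2, 30))
print([fib(n) for n in range(10)])  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

Memoization (`lru_cache`) keeps the naive recursion from exploding exponentially, so exhaustive checking over a range of inputs stays cheap.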

WiC Advisor Functional Coverage

This case study tested the GALICIA workflow on an IoT-focused application: the WiC Advisor. Key features:

  • CRUD operations: create, read, update, delete customer records.
  • Address verification: entering a user address like “Piazza Garibaldi 6, Milan”, then querying external operators (TIM, Open Fiber, Fastweb) to retrieve internet service data (e.g., speed, price).
  • Quote creation wizard: guides users in product selection and pricing.
  • Ticketing system: allows users to request technical support and track outcomes.

Each function was tested for:

  • Semantic completeness (does it cover all expected user actions?);
  • Data integrity (are external APIs handled securely?);
  • Error tolerance (can it gracefully handle malformed inputs or downtime?).
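A minimal sketch of the CRUD layer under test; the class and field names are assumptions for this illustration, not the real WiC Advisor API. The create and update paths show the kind of malformed-input handling the error-tolerance criterion probes.

```python
# Illustrative in-memory CRUD store for customer records (hypothetical API).

class CustomerStore:
    """Create, read, update, and delete customer records."""
    def __init__(self):
        self._records = {}
        self._next_id = 1

    def create(self, name, address):
        if not name or not address:                      # error tolerance:
            raise ValueError("name and address required")  # reject malformed input
        cid = self._next_id
        self._next_id += 1
        self._records[cid] = {"name": name, "address": address}
        return cid

    def read(self, cid):
        return self._records.get(cid)                    # None for unknown ids

    def update(self, cid, **fields):
        if cid not in self._records:
            raise KeyError(cid)
        self._records[cid].update(fields)

    def delete(self, cid):
        self._records.pop(cid, None)                     # idempotent delete

store = CustomerStore()
cid = store.create("Mario Rossi", "Piazza Garibaldi 6, Milan")
store.update(cid, address="Piazza Garibaldi 6, Milano")
print(store.read(cid))
store.delete(cid)
assert store.read(cid) is None
```

The real service adds external-operator queries (TIM, Open Fiber, Fastweb) on top of this record layer; those calls are exactly where the data-integrity criterion applies.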

Formalizing Euclid’s Geometry

Euclid’s geometry was chosen due to its clear axiomatic structure, making it ideal for formal modeling. The workflow:

  1. Extracted axioms and definitions, e.g.:
    • “A point is that which has no part”
    • “It is possible to draw a straight line from any point to any point”
  2. Translated into formal logic:
    • Example in first-order logic:
      ∀A,B ∃L (Line(L) ∧ PassesThrough(L,A) ∧ PassesThrough(L,B))
  3. Implemented in multiple languages:
    • Prolog: defined geometric constructs and logic predicates (e.g., distance).
    • Coq: proved geometric properties via theorem proving.
    • Python: visualized or simulated geometric constructions.
  4. Proved the construction of an equilateral triangle, built in Prolog:
    • points are represented as coordinates, with relations such as distance and equality defined between them.


The model was also built in Coq and Python.
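A sketch of what the Python counterpart might look like (the actual generated code is not reproduced here): points as coordinate pairs, a distance relation, and a numerical check that the construction of Euclid’s Elements I.1 yields an equilateral triangle.

```python
# Python rendering of the geometric model: points as coordinates,
# a distance relation, and a check of Euclid's equilateral-triangle
# construction (Elements, Proposition I.1).
import math

def distance(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def equilateral_apex(a, b):
    """Third vertex of the equilateral triangle on segment AB,
    obtained by rotating B around A by 60 degrees."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    cos60, sin60 = 0.5, math.sqrt(3) / 2
    return (a[0] + dx * cos60 - dy * sin60,
            a[1] + dx * sin60 + dy * cos60)

A, B = (0.0, 0.0), (1.0, 0.0)
C = equilateral_apex(A, B)

# Verification: all three sides have (numerically) equal length
sides = (distance(A, B), distance(B, C), distance(C, A))
assert all(math.isclose(s, sides[0]) for s in sides)
print(sides)
```

Note the `math.isclose` comparison: in floating-point coordinates, equality of side lengths holds only up to rounding, which is itself a useful lesson when moving from a theorem prover like Coq to executable Python.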

This case exemplified how well-defined domains can benefit from multi-modal formal modeling and cross-language execution.

The third case study modeled the TCP three-way handshake, focusing on:

  • an Event-B model generated by GPT;
  • its verification in a Python implementation.

The sequence ensures reliable communication between sender and receiver. A typical sequence:

  1. The Sender sends a message to the Receiver;
  2. Upon receiving the message, the Receiver validates its integrity (e.g., checksum);
  3. If the message is valid, the Receiver sends an Acknowledgment (ACK) to the Sender;
  4. The Sender:
    • waits for the ACK within a predefined timeout;
    • if the ACK is received, considers the message successfully transmitted;
    • if no ACK is received (timeout), retransmits the message;
  5. This process repeats until the message is successfully acknowledged or a maximum retry limit is reached.


Despite being a toy example, it demonstrated how even network protocols — typically modeled with finite state machines — can be formalized and verified through GALICIA’s approach.

Modeling the TCP Three-Way Handshake

In this case study, the goal was to model and verify the well-known TCP/IP handshake using a combination of natural language specifications, formal modeling, and generative AI code synthesis.

The communication sequence modeled includes:

  1. The Sender transmits a message (SYN).
  2. The Receiver checks message integrity (e.g., checksum).
  3. If valid, the Receiver sends an acknowledgment (ACK).
  4. The Sender waits for the ACK within a defined timeout.
  5. If no ACK is received, the message is retransmitted.


This behavior ensures reliable data transmission, even in the presence of network faults or delays.
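The retransmission behavior above can be sketched as a small Python simulation. The lossy channel, checksum, and retry limit are illustrative assumptions, and, as in the project's Python rendering of the Event-B model, each event is marked by a printout.

```python
# Sketch of send/validate/ACK with timeout-driven retransmission.
# Channel, checksum, and retry limit are illustrative assumptions.

MAX_RETRIES = 5

def checksum(payload: str) -> int:
    return sum(payload.encode()) % 256

def receiver(payload, expected_checksum):
    """Validate integrity and return an ACK only for valid messages."""
    if checksum(payload) == expected_checksum:
        print("Receiver: message valid, sending ACK")
        return "ACK"
    print("Receiver: checksum mismatch, discarding")
    return None

def send_reliably(payload, channel):
    """Retransmit until an ACK arrives or the retry limit is reached."""
    for attempt in range(1, MAX_RETRIES + 1):
        print(f"Sender: transmitting (attempt {attempt})")
        ack = channel(payload, checksum(payload))  # None models loss/timeout
        if ack == "ACK":
            print("Sender: ACK received, done")
            return True
        print("Sender: timeout, retransmitting")
    return False                                   # retry limit reached

# A lossy channel that drops the first two transmissions
losses = iter([True, True, False])
def lossy_channel(payload, cs):
    if next(losses, False):
        return None                                # simulated loss
    return receiver(payload, cs)

assert send_reliably("SYN", lossy_channel)
```

Even this toy run exercises the three behaviors the formal model must capture: successful delivery, loss followed by retransmission, and termination at the retry limit.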

Formal Model Creation

The Event-B model was generated by GPT through a co-creative process:

  • the LLM provided extensive user support;
  • the Event-B formalism itself was suggested by ChatGPT based on the features of the problem.

Model Verification

The model is implemented in Python:

  • a language extensively used by developers, as confirmed by our stakeholder survey;
  • the implementation closely mirrors the Event-B model;
  • each event is marked by a printout.

Toward a Co-creative Future of Software Development

The GALICIA project shows that building formal models from natural-language specifications is feasible in days, compared to the months required by manual methods. Generative AI, when guided correctly, helps:

  • Select the appropriate formal modeling paradigm;
  • Ensure completeness and consistency in the model;
  • Generate multi-language implementations from a shared logic base.


The process evolves into a co-creative collaboration between human experts and AI tools. While current examples are small, they lay the groundwork for scaling this method to more complex systems — a direction where trustworthy AI-generated software becomes not only possible, but standard.