Retrieval Augmented Generation (RAG) for local and secure document retrieval: a pipeline for data standardisation and semantic scoring

The increasing digitisation of information processes has made text data a central resource, but one that is complex to exploit. Heterogeneous documents such as CVs, reports and e-mails show high linguistic and semantic variability, and they still pose considerable challenges to automated systems in data extraction, semantic comparison and integration into decision-making processes.

The new AI4Cyber study analyses an approach that uses Large Language Models (LLMs) to structure information semantically. It proposes a pipeline that transforms unstructured documents into formal representations, and it introduces semantic comparison and scoring mechanisms to assess the relevance of information while ensuring control, transparency and reproducibility in local and secure environments.
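To make the idea of a structured representation plus a relevance score more concrete, here is a minimal sketch in Python. It is not the method used in the study: the `StructuredDoc` class, the field names and the weights are hypothetical, and a simple word-count cosine similarity stands in for the LLM-based semantic comparison. The scoring is deterministic and runs entirely locally, which illustrates the reproducibility and control aspects.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class StructuredDoc:
    """Hypothetical formal representation of one standardised document."""
    doc_id: str
    fields: dict[str, str]  # e.g. {"skills": "...", "summary": "..."}


def tokenize(text: str) -> Counter:
    # Lowercased word counts; an assumed stand-in for a semantic embedding.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def score(doc: StructuredDoc, query: str, weights: dict[str, float]) -> float:
    # Weighted average of per-field similarity to the query.
    total = sum(weights.values())
    if total == 0:
        return 0.0
    q = tokenize(query)
    weighted = sum(
        w * cosine(tokenize(doc.fields.get(name, "")), q)
        for name, w in weights.items()
    )
    return weighted / total


def rank(docs: list[StructuredDoc], query: str,
         weights: dict[str, float]) -> list[tuple[str, float]]:
    # Highest-scoring documents first; the same inputs always give the same order.
    scored = [(d.doc_id, score(d, query, weights)) for d in docs]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Calling `rank(docs, "python security engineer", {"skills": 2.0, "summary": 1.0})` would return the documents ordered by weighted similarity. In a real pipeline the `cosine` over word counts would presumably be replaced by a comparison of embeddings produced by a locally hosted model.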

Looking ahead, the analysis proposes an evolution towards models with more advanced reasoning capabilities and adaptive processes supported by human feedback, strengthening the role of LLMs in building structured and reliable knowledge bases.

If you wish to learn more, here is the link to our comprehensive study.

In addition, you can subscribe to the Cyber Studios by Tinexta Defence mailing list to receive updates on upcoming research:

https://tinextadefence.it/mailing-list-cyber-studios/

