
Conversation

@assouktim (Contributor):

In this first part of the RAG system improvement, we proceeded as follows:

  • Structuring the LLM response to obtain not only the answer, but also other information such as the status of the response, the subject of the conversation, and the actual context used (a rough sketch follows this list).
  • Modifying the default prompt for condensing questions.
  • Preparing the code for the multiple search stage.
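A rough sketch of what such a structured response could look like (the field names here are illustrative assumptions, not the exact models introduced in this PR):

from typing import Any, List, Optional
from pydantic import BaseModel

class LLMAnswer(BaseModel):
    # Illustrative shape only; the real model is defined in the PR.
    status: Optional[str] = None        # status of the response
    topic: Optional[str] = None         # subject of the conversation
    answer: Optional[str] = None        # the answer itself
    chunks: Optional[List[Any]] = None  # the actual context used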

@Benvii (Member) left a comment:

Thanks for this PR, see my comments.


class ChunkSentences(BaseModel):
    chunk: Optional[str] = None
    sentences: Optional[List[str]] = None

@Benvii (Member):

Add a description of what these sentences are. I assume it's the part of the chunk used to formulate the answer? Or maybe it's the sentences from the answer related to this chunk?

It's not described in the prompt template you gave on JIRA.

class ChunkSentences(BaseModel):
    chunk: Optional[str] = None
    sentences: Optional[List[str]] = None
    reason: Optional[str] = None

@Benvii (Member):

Same question here, what's this reason field used for?

@assouktim (Contributor, Author):

Reason why the chunk was not used (e.g., irrelevant, general background).
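For illustration, these descriptions could be carried directly on the fields, which would also address the request above (a sketch only; the sentences wording follows the reviewer's reading and is an assumption, and ChunkInfos is the renamed class mentioned further down):

from typing import List, Optional
from pydantic import BaseModel, Field

class ChunkInfos(BaseModel):
    chunk: Optional[str] = Field(
        default=None, description='Chunk identifier (or sequential number) taken from the context.'
    )
    sentences: Optional[List[str]] = Field(
        default=None, description='Assumed: sentences from the chunk used to formulate the answer.'
    )
    reason: Optional[str] = Field(
        default=None, description='Reason why the chunk was not used (e.g. irrelevant, general background).'
    )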

    footnotes: set[Footnote] = Field(description='Set of footnotes')

class ChunkSentences(BaseModel):
    chunk: Optional[str] = None

@Benvii (Member):

How are chunks linked to their source / footnotes? As I understand it, there is no link between them and we can't link them in this first implementation.

@assouktim (Contributor, Author):

Chunks are linked to footnotes through their ID:
ChunkInfos.chunk matches Document.metadata["id"] (I renamed ChunkSentences to ChunkInfos).
See the instruction that returns RAGResponse (in rag_chain.py).
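A minimal sketch of how that ID-based link can be resolved (the helper name is hypothetical, for illustration only):

def footnote_for_chunk(chunk_info, documents):
    # Match the chunk identifier returned by the LLM against the retrieved documents,
    # whose metadata["id"] the footnotes are built from.
    return next(
        (doc for doc in documents if doc.metadata.get('id') == chunk_info.chunk),
        None,
    )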

logger.debug('RAG chain - Use chat history: %s', len(message_history.messages) > 0)
logger.debug('RAG chain - Use RAGCallbackHandler for debugging : %s', debug)

callback_handlers = get_callback_handlers(request, custom_observability_handler, debug)

@Benvii (Member):

Isn't it easier to return a Tuple or an object here (with 2 fields, {records_callback_handler: [], observability_handler: []}) that explicitly splits the two kinds of handlers instantiated in get_callback_handlers, instead of having to split them back into the two categories RAGCallbackHandler and LangfuseCallbackHandler?
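For illustration, the split could be made explicit with a small container like this (a sketch; the class name is hypothetical, the field names mirror the suggestion above, not the codebase):

from dataclasses import dataclass, field

@dataclass
class CallbackHandlers:
    records_callback_handler: list = field(default_factory=list)
    observability_handler: list = field(default_factory=list)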

        None
    )
    observability_handler = next(
        (x for x in callback_handlers if isinstance(x, LangfuseCallbackHandler)),

@Benvii (Member):

Mentioning Langfuse here isn't really generic if we have an Arize Phoenix handler in the future (which is used by SNCF Connect).
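One way to keep it generic, sketched under the assumption that RAGCallbackHandler is the only debug/records handler and anything else counts as observability:

records_callback_handler = next(
    (x for x in callback_handlers if isinstance(x, RAGCallbackHandler)),
    None,
)
# Any other handler (Langfuse, Arize Phoenix, ...) is treated as observability.
observability_handler = next(
    (x for x in callback_handlers if not isinstance(x, RAGCallbackHandler)),
    None,
)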

]

def get_llm_answer(rag_chain_output) -> LLMAnswer:
    return LLMAnswer(**json.loads(rag_chain_output.strip().removeprefix("```json").removesuffix("```").strip()))

@Benvii (Member):

Is this manual parsing required? Doesn't LangChain have a parser for that kind of markdown-wrapped JSON?

I think you can call this instead: https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.json.JsonOutputParser.html#langchain_core.output_parsers.json.JsonOutputParser.parse
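A minimal sketch of the suggested call, assuming the input is the same fenced JSON string handled manually above:

from langchain_core.output_parsers import JsonOutputParser

parser = JsonOutputParser()
# parse() strips the markdown code fences and returns the JSON payload as a dict.
data = parser.parse(rag_chain_output)
llm_answer = LLMAnswer(**data)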

@assouktim (Contributor, Author), Sep 12, 2025:

No, it's not the return value of the LLM, but what the log handler records: it's the string that the handler sees.
We do use JsonOutputParser in the RAG chain.

# Construct the RAG chain using the prompt and LLM,
# This chain will consume the documents retrieved by the retriever as input.
rag_chain = construct_rag_chain(question_answering_llm, rag_prompt)
if question_condensing_llm is not None:

@Benvii (Member):

Suggested change:
- if question_condensing_llm is not None:
+ # Fallback in case of missing condensing LLM setting using the answering LLM setting.
+ if question_condensing_llm is not None:
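For context, the fallback that comment describes presumably boils down to something like this (a sketch only; the actual branch in the PR may be structured differently):

# Fallback in case of missing condensing LLM setting using the answering LLM setting.
condensing_llm_setting = question_condensing_llm if question_condensing_llm is not None else question_answering_llm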

{
    "context": lambda x: json.dumps([
        {
            "chunk_id": doc.metadata['id'],

@Benvii (Member):

Great to have a chunk_id here, but I don't see where the LLM should reuse this chunk_id in its reply, as it's not used / present in the output type: ChunkSentences doesn't have an ID field.

@assouktim (Contributor, Author):

We explain that in the prompt (a short illustration follows the list):

  • If explicit chunk identifiers are present in the context, use them; otherwise assign sequential numbers starting at 1.
  • For each chunk object:
    • "chunk": "<chunk_identifier_or_sequential_number>"
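So, for illustration, each chunk object in the reply simply echoes that identifier back (values made up):

# Illustrative only: the LLM echoes the chunk identifier it saw in the context.
ChunkInfos(chunk="1", sentences=["..."], reason=None)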

    'question': lambda inputs: inputs[
        'question'
    ],  # Override the user's original question with the condensed one
    "context": lambda x: json.dumps([

@Benvii (Member):

There seems to be duplicated code here; I don't see construct_rag_chain used anywhere in the codebase. How is it different from create_rag_chain?

I see that create_rag_chain is imported in the tooling run_experiment.py script but it's unused 😬; actually it's execute_rag_chain that is used in the tooling, so construct_rag_chain should be removed.

@assouktim (Contributor, Author):

@Benvii It may be best not to merge immediately and instead wait for the second part. This way, we can test and evaluate all the changes at once, rather than having to repeat the tests and impact all prompts multiple times.
