
Conversation

@assouktim (Contributor):

In this first part of the RAG system improvement, we proceeded as follows:

  • Structuring the LLM response to obtain not only the answer, but also other information such as the status of the response, the subject of the conversation, and the actual context used (a rough sketch follows this list).
  • Modifying the default prompt for condensing questions.
  • Preparing the code for the multiple search stage.
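A rough sketch of what such a structured response could look like (the field names here are illustrative assumptions, not the exact models introduced in this PR):

from typing import Any, List, Optional
from pydantic import BaseModel

class LLMAnswer(BaseModel):
    # Illustrative shape only; the real model is defined in the PR.
    status: Optional[str] = None        # status of the response
    topic: Optional[str] = None         # subject of the conversation
    answer: Optional[str] = None        # the answer itself
    chunks: Optional[List[Any]] = None  # the actual context used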

@Benvii (Member) left a comment:

Thanks for this PR, see my comments.


class ChunkSentences(BaseModel):
    chunk: Optional[str] = None
    sentences: Optional[List[str]] = None

@Benvii (Member):

Add a description of what these sentences are. I assume it's the part of the chunk used to formulate the answer? Or maybe it's the sentences from the answer related to this chunk?

It's not described in the prompt template you gave on JIRA.

class ChunkSentences(BaseModel):
    chunk: Optional[str] = None
    sentences: Optional[List[str]] = None
    reason: Optional[str] = None

@Benvii (Member):

Same question here, what's this reason field used for?

@assouktim (Contributor, Author):

Reason why the chunk was not used (e.g., irrelevant, general background).
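For illustration, these descriptions could be carried directly on the fields, which would also address the request above (a sketch only; the sentences wording follows the reviewer's reading and is an assumption, and ChunkInfos is the renamed class mentioned further down):

from typing import List, Optional
from pydantic import BaseModel, Field

class ChunkInfos(BaseModel):
    chunk: Optional[str] = Field(
        default=None, description='Chunk identifier (or sequential number) taken from the context.'
    )
    sentences: Optional[List[str]] = Field(
        default=None, description='Assumed: sentences from the chunk used to formulate the answer.'
    )
    reason: Optional[str] = Field(
        default=None, description='Reason why the chunk was not used (e.g. irrelevant, general background).'
    )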

    footnotes: set[Footnote] = Field(description='Set of footnotes')

class ChunkSentences(BaseModel):
    chunk: Optional[str] = None

@Benvii (Member):

How are chunks linked to their source / footnotes? As I understand it, there is no link between them and we can't link them in this first implementation.

@assouktim (Contributor, Author):

Chunks are linked to footnotes through their ID:
ChunkInfos.chunk matches Document.metadata["id"] (I renamed ChunkSentences to ChunkInfos).
See the instruction that returns RAGResponse (in rag_chain.py).
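A minimal sketch of how that ID-based link can be resolved (the helper name is hypothetical, for illustration only):

def footnote_for_chunk(chunk_info, documents):
    # Match the chunk identifier returned by the LLM against the retrieved documents,
    # whose metadata["id"] the footnotes are built from.
    return next(
        (doc for doc in documents if doc.metadata.get('id') == chunk_info.chunk),
        None,
    )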

logger.debug('RAG chain - Use chat history: %s', len(message_history.messages) > 0)
logger.debug('RAG chain - Use RAGCallbackHandler for debugging : %s', debug)

callback_handlers = get_callback_handlers(request, custom_observability_handler, debug)

@Benvii (Member):

Isn't it easier to return a Tuple or an object here (with 2 fields, {records_callback_handler: [], observability_handler: []}) that explicitly splits the two kinds of handlers instantiated in get_callback_handlers, instead of having to split them back into the two categories RAGCallbackHandler and LangfuseCallbackHandler?
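For illustration, the split could be made explicit with a small container like this (a sketch; the class name is hypothetical, the field names mirror the suggestion above, not the codebase):

from dataclasses import dataclass, field

@dataclass
class CallbackHandlers:
    records_callback_handler: list = field(default_factory=list)
    observability_handler: list = field(default_factory=list)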

        None
    )
    observability_handler = next(
        (x for x in callback_handlers if isinstance(x, LangfuseCallbackHandler)),

@Benvii (Member):

Mentioning Langfuse here isn't really generic if we have an Arize Phoenix handler in the future (which is used by SNCF Connect).
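One way to keep it generic, sketched under the assumption that RAGCallbackHandler is the only debug/records handler and anything else counts as observability:

records_callback_handler = next(
    (x for x in callback_handlers if isinstance(x, RAGCallbackHandler)),
    None,
)
# Any other handler (Langfuse, Arize Phoenix, ...) is treated as observability.
observability_handler = next(
    (x for x in callback_handlers if not isinstance(x, RAGCallbackHandler)),
    None,
)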

]

def get_llm_answer(rag_chain_output) -> LLMAnswer:
    return LLMAnswer(**json.loads(rag_chain_output.strip().removeprefix("```json").removesuffix("```").strip()))

@Benvii (Member):

Is this manual parsing required? Doesn't LangChain have a parser for that kind of markdown-wrapped JSON?

I think you can call this instead: https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.json.JsonOutputParser.html#langchain_core.output_parsers.json.JsonOutputParser.parse
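A minimal sketch of the suggested call, assuming the input is the same fenced JSON string handled manually above:

from langchain_core.output_parsers import JsonOutputParser

parser = JsonOutputParser()
# parse() strips the markdown code fences and returns the JSON payload as a dict.
data = parser.parse(rag_chain_output)
llm_answer = LLMAnswer(**data)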

@assouktim (Contributor, Author), Sep 12, 2025:

No, it's not the return value of the LLM, but what the log handler records: it's the string that the handler sees.
We do use JsonOutputParser in the RAG chain.

# Construct the RAG chain using the prompt and LLM,
# This chain will consume the documents retrieved by the retriever as input.
rag_chain = construct_rag_chain(question_answering_llm, rag_prompt)
if question_condensing_llm is not None:

@Benvii (Member):

Suggested change:
- if question_condensing_llm is not None:
+ # Fallback in case of missing condensing LLM setting using the answering LLM setting.
+ if question_condensing_llm is not None:
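For context, the fallback that comment describes presumably boils down to something like this (a sketch only; the actual branch in the PR may be structured differently):

# Fallback in case of missing condensing LLM setting using the answering LLM setting.
condensing_llm_setting = question_condensing_llm if question_condensing_llm is not None else question_answering_llm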

{
    "context": lambda x: json.dumps([
        {
            "chunk_id": doc.metadata['id'],

@Benvii (Member):

Great to have a chunk_id here, but I don't see where the LLM should reuse this chunk_id in its reply, as it's not used / present in the output type: ChunkSentences doesn't have an ID field.

@assouktim (Contributor, Author):

We explain that in the prompt (a short illustration follows the list):

  • If explicit chunk identifiers are present in the context, use them; otherwise assign sequential numbers starting at 1.
  • For each chunk object:
    • "chunk": "<chunk_identifier_or_sequential_number>"
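So, for illustration, each chunk object in the reply simply echoes that identifier back (values made up):

# Illustrative only: the LLM echoes the chunk identifier it saw in the context.
ChunkInfos(chunk="1", sentences=["..."], reason=None)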

    'question': lambda inputs: inputs[
        'question'
    ],  # Override the user's original question with the condensed one
    "context": lambda x: json.dumps([

@Benvii (Member):

There seems to be duplicated code here; I don't see construct_rag_chain used anywhere in the codebase. How is it different from create_rag_chain?

I see that create_rag_chain is imported in the tooling run_experiment.py script but it's unused 😬; actually it's execute_rag_chain that is used in the tooling, so construct_rag_chain should be removed.

@assouktim (Contributor, Author):

@Benvii It may be best not to merge immediately and instead wait for the second part. This way, we can test and evaluate all the changes at once, rather than having to repeat the tests and impact all prompts multiple times.
