Hey everyone.
I have some PDF documents containing information. Is there an LLM/AI I could feed these into and then ask questions, so that the AI answers with information from the documents, ideally pointing to where in the document it found the answer?
I have heard of tools called “chatPDF” (the name seems so clever that multiple sites use it), which seem to do exactly that. Unfortunately, I don’t want to upload these documents to some random website.
So I either need something like this that I could host myself (is something like that even feasible?), or an online service with good privacy guarantees, where I could trust that the documents would not be used for training, gathering statistics, etc.
Any help or pointers would be greatly appreciated!

  • e0qdk
    1 year ago

    So I either need something like this that I could host myself (is something like that even feasible?)

    The closest thing I could find that already exists is GPT4All Chat with LocalDocs Plugin. That basically builds a DB of snippets from your documents and then tries to pick relevant stuff based on your query to provide additional input as part of your prompt to a local LLM. There are details about what it can and can’t do further down the page. I have not tested this one myself, but this is something you could experiment with.

    Another idea, if you want to get more into engineering custom tools, would be to split the document (or documents) you want to interact with into multiple overlapping chunks that each fit within the context window. This assumes you can get the relevant content out; PyPDF2’s documentation explains why this can be difficult. You would then prompt the LLM for each chunk with something like "Does this text contain anything that answers [question]? [chunk text]". (It may take some experimentation to figure out how to engineer the prompt well.) You could repeat that for each chunk, gathering snippets, and then do a second pass over all the snippets, asking the LLM to summarize and/or rate the quality of its own answers (or however you want to combine results).
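
    A minimal sketch of the chunking step, under some assumptions: the text has already been extracted from the PDF into a plain string, chunk sizes are measured in characters rather than tokens (a real setup would want to count tokens for the specific model), and the names split_into_chunks and build_map_prompt are made up for illustration.

```python
def split_into_chunks(text: str, chunk_size: int = 2000, overlap: int = 200) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters, so an answer that
    straddles a chunk boundary still appears whole in at least one chunk.
    Sizes are in characters here (an assumption); token counts would be
    more accurate for a real context window.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a chunk has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def build_map_prompt(question: str, chunk: str) -> str:
    """Build the per-chunk prompt (wording is only a starting point)."""
    return (
        "Does this text contain anything that answers the question below? "
        "If so, quote the relevant part.\n\n"
        f"Question: {question}\n\nText:\n{chunk}"
    )
```

    For example, split_into_chunks("abcdefghij", chunk_size=4, overlap=1) gives ["abcd", "defg", "ghij"]: each chunk repeats the last character of the previous one.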

    Basically you would need two prompts: a “map” prompt that you apply to every snippet to extract the relevant info from it, and a “reduce” prompt that combines two answers into one (and is then chained, so that all the answers end up merged).

    i.e.:

    f(a) + f(b) + f(c) + ... + f(z)
    
    

    where f(a) is the result of the first extraction on snippet a and + means “combine these two snippets using the second prompt”. (You can evaluate in whatever order you feel is appropriate – including in parallel, if you have enough compute power for that.)
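
    As a rough sketch of that map/reduce chain: the code below assumes the LLM is available as a plain Python function that takes a prompt string and returns a string. That llm callable, the prompt wording, and the name map_reduce_answer are all placeholders for illustration, not part of any particular library. It evaluates strictly left to right; other orders (or running the map phase in parallel) would work just as well.

```python
from functools import reduce
from typing import Callable

# Hypothetical interface: any function that maps a prompt to a completion.
LLM = Callable[[str], str]

MAP_PROMPT = (
    "Does this text contain anything that answers the question below? "
    "If so, quote the relevant part; otherwise reply 'nothing relevant'.\n\n"
    "Question: {question}\n\nText:\n{chunk}"
)

REDUCE_PROMPT = (
    "Combine the two partial answers below into one answer to the question.\n\n"
    "Question: {question}\n\nAnswer 1:\n{first}\n\nAnswer 2:\n{second}"
)


def map_reduce_answer(llm: LLM, question: str, chunks: list[str]) -> str:
    """Compute f(a) + f(b) + ... over the chunks, left to right."""
    # Map phase: f(x) for every chunk x.
    partials = [llm(MAP_PROMPT.format(question=question, chunk=c)) for c in chunks]
    if not partials:
        return ""

    # Reduce phase: "+" means "combine two answers using the second prompt".
    def combine(first: str, second: str) -> str:
        return llm(REDUCE_PROMPT.format(question=question, first=first, second=second))

    return reduce(combine, partials)
```

    With n chunks this makes n map calls plus n - 1 reduce calls, which is a large part of why it would probably be slow on a local model.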

    If you have enough context space for it, you could include a summary of the previous state of the conversation as part of the prompts in order to get something like an actual conversation with the document going.
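
    One way the running-summary idea might look, again only as a sketch: llm is the same kind of hypothetical prompt-to-string function, answer_from_document stands in for whatever document pipeline you end up with, and the class name and prompt wording are invented for the example.

```python
from typing import Callable


class DocumentChat:
    """Keeps a short summary of the conversation and feeds it into each question."""

    def __init__(self, llm: Callable[[str], str],
                 answer_from_document: Callable[[str], str]) -> None:
        self.llm = llm                                    # hypothetical: prompt -> text
        self.answer_from_document = answer_from_document  # hypothetical: question -> answer
        self.summary = ""                                 # state of the conversation so far

    def ask(self, question: str) -> str:
        # Prepend the summary so follow-up questions keep their context.
        if self.summary:
            contextual = f"Conversation so far: {self.summary}\n\nNew question: {question}"
        else:
            contextual = question
        answer = self.answer_from_document(contextual)

        # Fold the new exchange into the summary so it stays small.
        self.summary = self.llm(
            "Summarize this conversation in a few sentences.\n\n"
            f"Previous summary: {self.summary or '(none)'}\n"
            f"Question: {question}\nAnswer: {answer}"
        )
        return answer
```

    Each turn costs one extra LLM call for the summary update, on top of whatever the document lookup itself needs.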

    No idea how well that would work in practice (probably very slow!), but it might be fun to experiment with.