The application is called Chat with RTX, and it lets users customize a GenAI model, similar to OpenAI's ChatGPT, by connecting it to files, documents, and notes that the model can then query.
“Rather than searching through notes or saved content, users can simply type queries. For example, one could ask, ‘What was the restaurant my partner recommended while in Las Vegas?’ and Chat with RTX will scan local files the user points it to and provide the answer with context.”
Nvidia
Chat with RTX supports several text-based models, such as Meta’s Llama 2, but by default uses the open-source model from AI company Mistral. Nvidia cautions that downloading all required data could take 50GB to 100GB of storage, depending on the model(s) chosen.
Chat with RTX supports the following file formats: plain text, PDF, .doc/.docx, and XML. Supported files can be loaded into the model’s fine-tuning dataset simply by pointing the program at a folder that contains them. Chat with RTX can also load transcriptions of the videos in a YouTube playlist from the playlist’s URL, letting the selected model query their contents.
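To make the idea of querying local files more concrete, here is a minimal, illustrative Python sketch. It is not Nvidia’s implementation: the folder path and query are hypothetical, only plain-text files are read, and retrieval is reduced to naive keyword matching. A real tool would also parse PDF, Word, and XML files and hand the best-matching text to a local language model as context.

```python
import os
from collections import Counter

# Hypothetical folder the user "points the app at" (example path only).
DOCS_FOLDER = "C:/Users/me/Notes"

def load_plaintext_docs(folder):
    """Collect the contents of .txt files under the folder (PDF/Word/XML parsing omitted)."""
    docs = {}
    for root, _dirs, files in os.walk(folder):
        for name in files:
            if os.path.splitext(name)[1].lower() == ".txt":
                path = os.path.join(root, name)
                with open(path, encoding="utf-8", errors="ignore") as f:
                    docs[path] = f.read()
    return docs

def best_match(query, docs):
    """Rank documents by simple keyword overlap with the query and return the top hit."""
    q_words = Counter(query.lower().split())
    scored = []
    for path, text in docs.items():
        t_words = Counter(text.lower().split())
        overlap = sum(min(q_words[w], t_words[w]) for w in q_words)
        scored.append((overlap, path))
    return max(scored, default=(0, None))

if __name__ == "__main__":
    query = "What was the restaurant my partner recommended while in Las Vegas?"
    docs = load_plaintext_docs(DOCS_FOLDER)
    score, path = best_match(query, docs)
    if path:
        # In a full system, the matching passage would be passed to a local model
        # so it can answer "with context", as the Nvidia quote above describes.
        print(f"Best-matching file ({score} keyword hits): {path}")
```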
However, there are some restrictions to be aware of, which Nvidia thankfully details in a how-to guide.
Nvidia also notes that a number of variables, some of which are simpler to control for than others, can influence how relevant the app’s answers are. These variables include the way the query is phrased, how well the chosen model performs, and the size of the fine-tuning dataset.
Asking for information that is covered in a handful of documents will probably produce more accurate results than asking for a summary of a document or a collection of documents. Larger datasets also typically yield better answer quality, according to Nvidia, as does pointing Chat with RTX at additional content related to a specific topic.
In a recent report, the World Economic Forum forecast a “dramatic” increase in the number of reasonably priced devices, such as PCs, cellphones, Internet of Things devices, and networking equipment, that can run GenAI models offline. The reason, according to the WEF, is the obvious advantages: offline models are not only cheaper and lower-latency than cloud-hosted models, but they are also inherently more private, because the data they process never leaves the device they run on.