CN118377844A - Text generation enhancement method and device applied to search enhancement generation - Google Patents
- Publication number
- CN118377844A CN202410512961.4A
- Authority
- CN
- China
- Prior art keywords
- query
- search
- retrieval
- module
- vector
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
- G06F16/316—Indexing structures
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/194—Calculation of difference between files
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Abstract
The invention relates to retrieval enhancement generation (RAG) technology in the fields of computer software, natural language processing and information retrieval, which improves the text generation process by combining search technology with a large language model, and in particular to a text generation enhancement method and device applied to retrieval enhancement generation. The method comprises the following steps: S1, query preprocessing and feature extraction; S2, adaptive retrieval optimization and fine-tuning; S3, historical performance analysis and intelligent query routing decision; S4, query conversion, context compression and vectorized retrieval; S5, result re-ranking and filtering; S6, response synthesis to produce a final Response. Through adaptive retrieval optimization, the invention improves the accuracy and efficiency of retrieval; it strengthens the continuity and memory capacity of the dialogue context, providing a smoother interaction experience; and by retrieving the latest reliable knowledge sources, it improves the update frequency and accuracy of the information in the RAG system, and thereby the retrieval efficiency of RAG technology and the accuracy of its retrieval results.
Description
Technical Field
The invention relates to the technical field of computers, in particular to retrieval enhancement generation (RAG) technology in the fields of computer software, natural language processing and information retrieval, which improves the text generation process by combining search technology with a large language model, and specifically to a text generation enhancement method and device applied to retrieval enhancement generation.
Background
Retrieval enhancement generation (RAG) technology greatly improves the quality and accuracy of text generation by combining answers generated by a large language model with information retrieved from a data source. Many products are built almost entirely on RAG, ranging from web search engines and large language model question-answering services to various chat applications that interact with data.
At present, work on retrieval enhancement generation mainly covers the following techniques:
1. Document segmentation, vectorization and index construction to support efficient information retrieval: text is split into small paragraphs, converted into vectors using a Transformer model, and indexed, and a context prompt for the large language model is then created from the retrieved information to answer the user's question.
2. Advanced query conversion, which uses a large language model to optimize queries; chat logic to handle complex conversations; and intelligent query routing to automatically select the most appropriate data source or indexing strategy.
3. Generation of a suitable answer from the retrieved context through an intelligent agent and a response synthesizer, together with fine-tuning of the encoder and the large language model, which improves the overall performance of the system and the naturalness of interaction.
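By way of a non-limiting illustration, the segmentation and prompt-construction steps of item 1 above might be sketched as follows. The word-window splitting rule, the overlap size and the prompt wording are assumptions made for this sketch only; the source does not specify them, and the function names are hypothetical.

```python
from typing import List


def split_into_chunks(text: str, max_words: int = 50, overlap: int = 10) -> List[str]:
    """Split text into overlapping word windows (a simple stand-in for paragraph segmentation)."""
    if max_words <= overlap:
        raise ValueError("max_words must be larger than overlap")
    words = text.split()
    chunks: List[str] = []
    step = max_words - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def build_prompt(question: str, retrieved_chunks: List[str]) -> str:
    """Assemble a context prompt for the language model from retrieved chunks."""
    context = "\n\n".join(retrieved_chunks)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


# Usage: 120 numbered "words" split into windows of 50 with 10 words of overlap.
sample_chunks = split_into_chunks(" ".join(str(i) for i in range(120)), 50, 10)
sample_prompt = build_prompt("Which numbers appear?", sample_chunks[:1])
```

The overlap keeps sentences that straddle a boundary visible in both neighbouring chunks; a real system would more likely split on paragraph or sentence boundaries.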
The combined application of these techniques not only enhances the model's ability to handle complex queries, but also provides users with a smoother and more natural interaction experience.
RAG technology has been applied in many products and has greatly improved the quality and accuracy of text generation, but because the technology is still developing rapidly, considerable room for optimization remains:
1. Existing RAG systems typically rely on static external knowledge sources, meaning that they cannot dynamically integrate up-to-date information or adapt to real-time updates of knowledge. As a result, the relevance and accuracy of the generated content decrease over time;
2. The retrieval strategy adopted by current RAG technology is usually preset and cannot be adaptively adjusted according to the specific nature and context of the query, so retrieval efficiency and result accuracy are low;
3. The prior art lacks an efficient integrated method for query conversion and chat logic processing; in particular, it does not make full use of a large language model as an inference engine to automatically handle the decomposition of query content, backtracking prompts and the compression of chat context;
4. When processing queries, current RAG technology lacks an intelligent query routing mechanism to automatically identify and select the most suitable data source or indexing strategy, so retrieval efficiency and accuracy are not optimal;
In summary, conventional RAG technology still has many shortcomings, which result in low retrieval efficiency and low accuracy of retrieval results.
Disclosure of Invention
In order to improve the retrieval efficiency of RAG technology and the accuracy of its retrieval results, the invention provides a text generation enhancement method and device applied to retrieval enhancement generation, which aim to solve the core problems of real-time information updating, adaptive adjustment of the retrieval strategy, integrated query processing and intelligent routing in RAG technology, and to improve retrieval accuracy, efficiency and the user interaction experience.
The text generation enhancement method applied to retrieval enhancement generation relies on a deep learning model and an adaptive algorithm. Text content and queries are converted into vectors by a Transformer model, and an efficient index is built with the FAISS tool, enabling fast and accurate search of a large-scale database. The query is analyzed using natural language processing, the retrieval strategy and parameters are adjusted dynamically, and a fine-tuning strategy is developed for few-sample scenarios. In addition, an intelligent query routing system selects the optimal data source or indexing strategy based on the nature of the query and historical performance data. In query conversion and chat logic, a large language model is used as an inference engine to optimize query processing and chat context compression. Finally, a post-processor re-ranks and filters the retrieval results, and chat logic processing and intelligent agent interaction are combined to synthesize a high-quality response.
In a first aspect, the present invention provides a text generation enhancement method applied to retrieval enhancement generation, which adopts the following technical scheme:
A text generation enhancement method applied to retrieval enhancement generation, comprising the following steps:
S1, query preprocessing and feature extraction;
Adopting a BERT model or a Transformer model to convert the query and the document content into a query vector and a document content vector respectively, constructing an efficient vector index by using the FAISS tool, preprocessing the user query Q, and extracting query features F;
Query feature extraction:
F(Q)={f1(Q),f2(Q),...,fn(Q)}; wherein fi(Q) represents the i-th feature of query Q;
S2, adaptive retrieval optimization and fine-tuning;
Using natural language processing (NLP) to analyze in depth the complexity, domain characteristics and urgency of the user query, identifying the domain and requirements of the query through comprehensive semantic analysis, and performing semantic parsing and conversion of the user query, including rewriting the query and splitting it into sub-queries Qsub;
Dynamically adjusting retrieval strategy parameters;
Adjusting the nprobe parameter of FAISS to balance the accuracy and speed of the query vector search, and selecting an appropriate hierarchical index level: a lower nprobe value is set initially, and the nprobe value is then gradually increased until the search accuracy reaches the desired level;
Assuming that Q is the vector representation of the user query, D denotes a document content vector from the set of document representations, and f(Q,D) is a scoring function representing the similarity between the query and the document; C is the context information of the query; the dynamic parameter adjustment is expressed as a function g(Q,C) whose output is the retrieval parameter set P; the retrieval process can then be expressed by the following formula:
R = argmax_{D∈S(g(Q,C))} f(Q,D);
Wherein S(g(Q,C)) is the search strategy and index range determined based on the retrieval parameter set P;
S3, historical performance analysis and intelligent query routing decision;
Classifying external knowledge sources and retrieving from them selectively to learn more domain-specific knowledge, monitoring and evaluating the knowledge sources in real time, integrating the latest reliable knowledge sources, generating a local knowledge source retrieval library, and recording the addresses of the reliable knowledge sources and updating them regularly;
Evaluating the historical performance of each data source and indexing strategy according to the query features F(Q) and the historical performance data H, and predicting the performance P(S|F(Q)) of each strategy on the current query by using a machine learning model;
P(S|F(Q))=MLModel(F(Q));
Wherein MLModel denotes a machine learning model for predicting the performance of each strategy; the optimal data source and indexing strategy S are selected based on the performance prediction P(S|F(Q));
Converting the query Q into a vector V(Q), ready for vector retrieval in the corresponding data source using the selected indexing strategy S;
V(Q)=Embed(Q);
Wherein Embed represents the query embedding function;
S4, query conversion, context compression and vectorized retrieval;
Compressing the chat context Ctx into key information Ctxcore by using an advanced large language model;
Constructing a search index to store the vectorized content of the documents and queries;
Vectorizing the sub-query Qsub and the compressed chat context Ctxcore, selecting a suitable data source DS to perform vectorized retrieval, and quickly retrieving the most relevant information in the data source DS according to the converted vector representations, wherein the retrieval process can be represented by the following formula:
Results=Search(V(Qsub),V(Ctxcore),DS);
Wherein V(Qsub) and V(Ctxcore) represent the vector representation of the sub-query and the vector representation of the compressed context respectively, and the Search function represents the vectorized retrieval operation performed on the selected data source DS, whose output is the retrieval result set;
S5, result re-ranking and filtering;
Using a LlamaIndex post-processor to weight each retrieval result based on similarity scores, keywords and metadata, and optimizing the final retrieval results through filtering and re-ranking;
Combining the optimized result set filtered_results with the chat context Ctx;
a. Context update: updating the chat context based on filtered_results and the original chat context Ctx to form updated_ctx;
b. Intelligent agent interaction: on the basis of the context update, the intelligent agent processes and answers the user's subsequent questions or references according to updated_ctx;
S6, response synthesis;
Based on the optimized retrieval results, synthesizing a final Response by iteratively refining the answer, by summarizing the retrieved context, or by generating multiple answers from different context blocks and then combining them into a final answer.
Preferably, the preprocessing of the user query Q includes denoising and normalization.
Preferably, the extracted query features F include the length of the query, the keyword density of the query, and the semantic features of the query.
Preferably, the weighting formula in S5 is:
Score(R)=w1·Similarity(R,Q)+w2·Complexity(R)+w3·Relevance(R,Ctx)+w4·KeywordDensity(R);
where the similarity score Similarity(R,Q) is computed from V(Q), the vector representation of the query, and V(R), the vector representation of the result; R represents a single retrieval result, Q is the user's query, Ctx represents the query context, and w1, w2, w3 and w4 are the weights of the respective factors; results with scores below a threshold are filtered out according to this scoring mechanism.
In a second aspect, the present invention provides a text generation enhancement device applied to retrieval enhancement generation, which adopts the following technical scheme:
A text generation enhancement device applied to retrieval enhancement generation, comprising the following modules:
The query preprocessing and feature extraction module comprises:
The conversion module is used for converting the query and the document content into a query vector and a document content vector respectively by adopting a BERT model or a Transformer model;
a vector index construction module for constructing an efficient vector index using FAISS tools;
the preprocessing module is used for preprocessing the query vector;
The feature extraction module is used for extracting query features F;
An adaptive retrieval optimization fine-tuning module, comprising:
The natural language processing (NLP) module, used for analyzing in depth the complexity, domain characteristics and urgency of the user query and identifying the domain and requirements of the query through comprehensive semantic analysis;
The parsing and conversion module, used for semantic parsing and conversion of the user query, including rewriting the query and splitting it into sub-queries Qsub;
A dynamic retrieval strategy parameter adjustment module, used for adjusting the nprobe parameter of FAISS to balance the accuracy and speed of the query vector search and selecting an appropriate hierarchical index level, wherein a lower nprobe value is set initially and the nprobe value is then gradually increased until the search accuracy reaches the desired level;
A historical performance analysis and intelligent query routing decision module, comprising:
The learning module, used for classifying external knowledge sources and retrieving from them selectively to learn more domain-specific knowledge, monitoring and evaluating the knowledge sources in real time, integrating the latest reliable knowledge sources, generating a local knowledge source retrieval library, and recording the addresses of the reliable knowledge sources and updating them regularly;
The evaluation module, used for evaluating the historical performance of each data source and indexing strategy according to the query features F(Q) and the historical performance data H, and predicting the performance P(S|F(Q)) of each strategy on the current query by using a machine learning model;
The vector conversion and retrieval module, used for converting the query Q into a vector V(Q), ready for vector retrieval in the corresponding data source using the selected indexing strategy S;
A context compression and vectorized retrieval module, comprising:
The context compression module, used for compressing the chat context Ctx into key information Ctxcore by using an advanced large language model;
A search index construction module, used for constructing a search index to store the vectorized content of documents and queries;
The vectorization and retrieval module, used for vectorizing the sub-query Qsub and the compressed chat context Ctxcore, selecting a suitable data source DS to perform vectorized retrieval, and quickly retrieving the most relevant information in the data source DS according to the converted vector representations;
The result re-ranking and filtering module, used for weighting each retrieval result based on similarity scores, keywords and metadata by using a LlamaIndex post-processor, and optimizing the final retrieval results through filtering and re-ranking;
The combination module, used for combining the optimized result set filtered_results with the chat context Ctx;
a. Context update: updating the chat context based on filtered_results and the original chat context Ctx to form updated_ctx;
b. Intelligent agent interaction: on the basis of the context update, the intelligent agent processes and answers the user's subsequent questions or references according to updated_ctx;
The response module, used for synthesizing a final Response based on the optimized retrieval results by iteratively refining the answer, by summarizing the retrieved context, or by generating multiple answers from different context blocks and then combining them into a final answer.
Preferably, the preprocessing module includes:
The denoising module is used for denoising the query vector;
And the normalization module is used for normalizing the query vector.
Preferably, the feature extraction module includes:
A length feature extraction module, used for extracting the query length among the query features F;
The keyword density extraction module, used for extracting the keyword density among the query features F;
The semantic feature extraction module, used for extracting the semantic features among the query features F.
In a third aspect, the present application provides an electronic device, which adopts the following technical scheme:
An electronic device, comprising:
One or more processors;
A memory;
One or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more applications being configured to perform the text generation enhancement method applied to retrieval enhancement generation described in the first aspect above.
In a fourth aspect, the present invention provides a computer readable storage medium, which adopts the following technical scheme:
A computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the text generation enhancement method applied to retrieval enhancement generation according to the first aspect above.
In summary, the invention has the following beneficial technical effects:
1. The adaptive retrieval optimization improves the accuracy and efficiency of retrieval.
2. Through context compression, combination and related processes, the invention strengthens the continuity and memory capacity of the dialogue context and provides a smoother interaction experience.
3. By retrieving the latest reliable knowledge sources, the invention improves the update frequency and accuracy of information in the RAG system, and thereby the retrieval efficiency of RAG technology and the accuracy of its retrieval results.
4. According to the invention, manual maintenance is replaced by automatic maintenance, so that the labor intensity is reduced, and the labor cost is reduced.
Drawings
FIG. 1 is a flow chart of a method for text generation enhancement applied to retrieval enhancement generation in an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
The embodiment of the invention discloses a text generation enhancement method applied to retrieval enhancement generation.
Referring to fig. 1, the method includes the steps of:
S1, query preprocessing and feature extraction;
Adopting a BERT model or a Transformer model to convert the query and the document content into a query vector and a document content vector respectively, constructing an efficient vector index by using the FAISS tool, preprocessing the user query Q, and extracting query features F;
In order to improve retrieval efficiency, particularly for a large-scale database, a summary index is constructed to quickly screen out the document blocks with higher relevance, which are then searched in depth.
The preprocessing of the user query Q comprises operations such as denoising and normalization.
Wherein query feature extraction may be represented as follows:
F(Q)={f1(Q),f2(Q),...,fn(Q)};
Wherein fi(Q) represents the i-th feature of query Q;
The extracted query features F include the length of the query, the keyword density of the query, the semantic features of the query and the like, and provide the necessary input for the subsequent adaptive retrieval strategy.
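As a minimal sketch of the preprocessing and feature extraction described above, the following assumes that denoising means stripping punctuation and surplus whitespace, that normalization means Unicode normalization and lower-casing, and that keyword density is the share of non-stopword tokens. These definitions, the stopword list and the function names are assumptions of this illustration; semantic features, which would come from the embedding model, are omitted.

```python
import re
import unicodedata
from typing import Dict, Set

# Hypothetical stopword list, for illustration only.
STOPWORDS: Set[str] = {"a", "an", "the", "of", "in", "is", "to", "and", "for", "what"}


def preprocess(query: str) -> str:
    """Denoise and normalize: Unicode-normalize, lowercase, drop symbols, collapse whitespace."""
    text = unicodedata.normalize("NFKC", query).lower()
    text = re.sub(r"[^\w\s]", " ", text)  # replace punctuation and symbols with spaces
    return re.sub(r"\s+", " ", text).strip()


def extract_features(query: str) -> Dict[str, float]:
    """Return a small F(Q): query length in tokens and keyword density."""
    tokens = preprocess(query).split()
    keywords = [t for t in tokens if t not in STOPWORDS]
    return {
        "length": float(len(tokens)),
        "keyword_density": len(keywords) / len(tokens) if tokens else 0.0,
    }


features = extract_features("  What is   FAISS?!  ")
```

Here the cleaned query is "what is faiss", so the length feature is 3 and one token out of three counts as a keyword.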
Finally, a large language model is used to semantically understand and automatically decompose the query: a complex query is automatically decomposed into sub-queries Qsub that are easier to process, and backtracking prompts are generated to enrich the context information of the query.
S2, adaptive retrieval optimization and fine-tuning;
Natural language processing (NLP) is used to analyze in depth the complexity, domain characteristics and urgency of the user query; the domain and requirements of the query are identified through comprehensive semantic analysis, providing a basis for the subsequent adjustment of the retrieval strategy. This process involves semantic parsing and conversion of the user query, including rewriting the query and splitting it into more manageable sub-queries Qsub.
Dynamically adjusting retrieval strategy parameters;
The retrieval strategy parameters are then adjusted dynamically based on the results of the query analysis. This includes adjusting the nprobe parameter of FAISS to balance the accuracy and speed of the query vector search, and selecting an appropriate hierarchical index level: a lower nprobe value is set initially to ensure search speed, and the nprobe value is then gradually increased until the search accuracy reaches the desired level;
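The nprobe escalation described above can be illustrated with a toy inverted-file index written in plain Python. This is not FAISS: in FAISS, nprobe is an attribute of IVF-type indexes that sets how many inverted lists are visited per query, and the class below only imitates that behaviour. Measuring "search accuracy" as recall against an exhaustive search is an assumption of this sketch, as are all names in it.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


class ToyIVF:
    """Tiny inverted-file index: each vector is stored under its nearest centroid."""

    def __init__(self, centroids: List[Vector]):
        self.centroids = centroids
        self.lists: Dict[int, List[Tuple[int, Vector]]] = {i: [] for i in range(len(centroids))}

    def add(self, doc_id: int, vec: Vector) -> None:
        cell = min(range(len(self.centroids)), key=lambda i: sq_dist(vec, self.centroids[i]))
        self.lists[cell].append((doc_id, vec))

    def search(self, query: Vector, k: int, nprobe: int) -> List[int]:
        # Visit only the nprobe cells whose centroids are closest to the query.
        cells = sorted(self.lists, key=lambda i: sq_dist(query, self.centroids[i]))[:nprobe]
        candidates = [item for c in cells for item in self.lists[c]]
        candidates.sort(key=lambda item: sq_dist(query, item[1]))
        return [doc_id for doc_id, _ in candidates[:k]]


def escalate_nprobe(index: ToyIVF, query: Vector, k: int, target_recall: float) -> Tuple[int, float]:
    """Start at nprobe=1 and increase until recall against exhaustive search is reached."""
    n_cells = len(index.centroids)
    exact = set(index.search(query, k, nprobe=n_cells))  # exhaustive baseline
    nprobe = 1
    while True:
        found = set(index.search(query, k, nprobe))
        recall = len(found & exact) / len(exact) if exact else 1.0
        if recall >= target_recall or nprobe >= n_cells:
            return nprobe, recall
        nprobe += 1


index = ToyIVF([(0.0, 0.0), (10.0, 0.0), (20.0, 0.0)])
for doc_id, vec in enumerate([(4.0, 0.0), (6.0, 0.0), (19.0, 0.0)]):
    index.add(doc_id, vec)
chosen_nprobe, achieved_recall = escalate_nprobe(index, (4.9, 0.0), k=2, target_recall=1.0)
```

In the usage example the two nearest documents sit in different cells, so a single probe finds only one of them and the loop settles on two probes. A production system would estimate recall on a held-out query sample rather than run an exhaustive search per query.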
For few-sample scenarios, a fine-tuning strategy is developed: through transfer learning, a pre-trained model that performs well on a source task is selected and its feature extractor is transferred, and the model is then fine-tuned to strengthen its performance in the specific domain.
Assuming that Q is the vector representation of the user query, D denotes a document content vector from the set of document representations, and f(Q,D) is a scoring function representing the similarity between the query and the document; C is the context information of the query; the dynamic parameter adjustment is expressed as a function g(Q,C) whose output is the retrieval parameter set P; the retrieval process can then be expressed by the following formula:
R = argmax_{D∈S(g(Q,C))} f(Q,D);
Wherein S(g(Q,C)) is the search strategy and index range determined based on the retrieval parameter set P;
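The formula above can be read as three composed functions, sketched below under stated assumptions: g derives a parameter set P from the context (here only a hypothetical domain tag and an urgency flag, ignoring the query vector itself), S narrows the document set according to P, and f is taken to be cosine similarity. None of these concrete choices comes from the source.

```python
import math
from typing import Any, Dict, List, Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def g(query_vec: Sequence[float], context: Dict[str, Any]) -> Dict[str, Any]:
    """Dynamic parameter adjustment: derive the parameter set P from Q and C (assumed rules)."""
    return {
        "domain": context.get("domain"),
        "max_candidates": 10 if context.get("urgent") else 100,
    }


def S(params: Dict[str, Any], documents: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Search scope determined by P: filter by domain, then cap the candidate count."""
    pool = [d for d in documents if params["domain"] in (None, d["domain"])]
    return pool[: params["max_candidates"]]


def retrieve(query_vec: Sequence[float], context: Dict[str, Any],
             documents: List[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """R = argmax over D in S(g(Q, C)) of f(Q, D), with f = cosine similarity."""
    scope = S(g(query_vec, context), documents)
    return max(scope, key=lambda d: cosine(query_vec, d["vec"]), default=None)


docs = [
    {"id": "a", "domain": "law", "vec": [1.0, 0.0]},
    {"id": "b", "domain": "med", "vec": [0.9, 0.1]},
    {"id": "c", "domain": "med", "vec": [0.0, 1.0]},
]
best_med = retrieve([1.0, 0.0], {"domain": "med"}, docs)
best_any = retrieve([1.0, 0.0], {}, docs)
```

The usage lines show the effect of the restricted scope: document "a" is the global best match, but with the context limited to the "med" domain the maximum is taken over "b" and "c" only.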
S3, historical performance analysis and intelligent query routing decision;
An intelligent query routing system is implemented to automatically select the most suitable data source or indexing strategy. This advanced query routing mechanism optimizes the selection according to the nature of the query and historical performance data, thereby improving retrieval efficiency and accuracy and ultimately providing users with higher-quality responses.
External knowledge sources are classified and retrieved from selectively to learn more domain-specific knowledge; the knowledge sources are monitored and evaluated in real time, the latest reliable knowledge sources are integrated, a local knowledge source retrieval library is generated, and the addresses of the reliable knowledge sources are recorded and updated regularly;
The historical performance of each data source and indexing strategy is evaluated according to the query features F(Q) and the historical performance data H, and a machine learning model is used to predict the performance P(S|F(Q)) of each strategy on the current query;
P(S|F(Q))=MLModel(F(Q));
Wherein MLModel denotes a machine learning model for predicting the performance of each strategy; the optimal data source and indexing strategy S are selected based on the performance prediction P(S|F(Q));
The query Q is converted into a vector V(Q), ready for vector retrieval in the corresponding data source using the selected indexing strategy S;
V(Q)=Embed(Q);
Wherein Embed represents the query embedding function;
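The routing decision above might be sketched as follows. The source does not say which machine learning model MLModel is, so this illustration substitutes a very simple one: a distance-weighted average of past scores per strategy. The history format, the strategy names and the weighting rule are all assumptions of the sketch.

```python
from typing import Dict, List, Tuple

Features = Dict[str, float]
# Historical performance data H: (F(Q), strategy S, observed performance score).
History = List[Tuple[Features, str, float]]


def predict_performance(features: Features, history: History) -> Dict[str, float]:
    """Stand-in for MLModel: distance-weighted average of past scores per strategy."""
    totals: Dict[str, float] = {}
    weights: Dict[str, float] = {}
    for past, strategy, score in history:
        dist = sum((features[k] - past[k]) ** 2 for k in features) ** 0.5
        w = 1.0 / (1.0 + dist)  # closer past queries count more
        totals[strategy] = totals.get(strategy, 0.0) + w * score
        weights[strategy] = weights.get(strategy, 0.0) + w
    return {s: totals[s] / weights[s] for s in totals}


def route(features: Features, history: History) -> str:
    """Select the strategy S with the highest predicted P(S|F(Q))."""
    predictions = predict_performance(features, history)
    if not predictions:
        raise ValueError("no historical performance data to route on")
    return max(predictions, key=lambda s: predictions[s])


H: History = [
    ({"length": 3.0}, "keyword_index", 0.9),
    ({"length": 30.0}, "keyword_index", 0.2),
    ({"length": 3.0}, "vector_index", 0.5),
    ({"length": 30.0}, "vector_index", 0.8),
]
short_choice = route({"length": 4.0}, H)
long_choice = route({"length": 29.0}, H)
```

With this toy history, short queries are routed to the hypothetical keyword index and long queries to the vector index, because each strategy's past scores on similar queries dominate its prediction.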
S4, query conversion, context compression and vectorized retrieval;
In query conversion and chat logic, a large language model is used as an inference engine: automatic decomposition of query content, backtracking prompts and chat context compression further improve the quality and efficiency of RAG interaction.
In order to optimize the processing efficiency of the chat engine and enhance the accuracy of information, the chat context Ctx is first compressed into key information Ctxcore by using an advanced large language model;
This process extracts the contextual information most relevant to the current query, so that subsequent query processing and information retrieval are more focused and efficient.
A search index is then constructed to store the vectorized content of documents and queries. To adapt to data sets of different scales and improve search speed, several vector indexing methods are adopted, including FAISS, nmslib and Annoy, which achieve fast and accurate search by using approximate nearest neighbor algorithms. Meanwhile, in order to manage a large-scale document database efficiently, a hierarchical indexing strategy is adopted: document blocks with higher relevance are first quickly screened out through a summary index and are then searched in depth.
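The hierarchical indexing strategy described above can be sketched as a two-stage search. This illustration uses exact cosine similarity over in-memory lists instead of an approximate nearest neighbor library, and the document layout (one summary vector plus chunk vectors per document) is an assumption of the sketch.

```python
import math
from typing import Any, Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hierarchical_search(query_vec: Sequence[float], documents: List[Dict[str, Any]],
                        top_docs: int = 2, top_chunks: int = 3) -> List[Tuple[str, int]]:
    """Stage 1 screens documents by summary vector; stage 2 ranks chunks inside them."""
    shortlisted = sorted(
        documents, key=lambda d: cosine(query_vec, d["summary_vec"]), reverse=True
    )[:top_docs]
    scored = [
        (cosine(query_vec, vec), d["id"], i)
        for d in shortlisted
        for i, vec in enumerate(d["chunk_vecs"])
    ]
    scored.sort(reverse=True)
    return [(doc_id, chunk_no) for _, doc_id, chunk_no in scored[:top_chunks]]


library = [
    {"id": "A", "summary_vec": [1.0, 0.0], "chunk_vecs": [[1.0, 0.0], [0.8, 0.6]]},
    {"id": "B", "summary_vec": [0.0, 1.0], "chunk_vecs": [[0.0, 1.0]]},
]
hits = hierarchical_search([1.0, 0.0], library, top_docs=1, top_chunks=2)
```

Only the chunks of shortlisted documents are scored in the second stage, which is where the saving on a large database comes from.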
The sub-query Qsub and the compressed chat context Ctxcore are vectorized, a suitable data source DS is selected to perform vectorized retrieval, and the most relevant information in the data source DS is quickly retrieved according to the converted vector representations, wherein the retrieval process can be represented by the following formula:
Results=Search(V(Qsub),V(Ctxcore),DS);
Wherein V(Qsub) and V(Ctxcore) represent the vector representation of the sub-query and the vector representation of the compressed context respectively, and the Search function represents the vectorized retrieval operation performed on the selected data source DS, whose output is the retrieval result set;
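The source does not state how Search combines V(Qsub) and V(Ctxcore). One plausible reading, shown here purely as an assumption, is to blend the two vectors with a weight alpha and rank the data source by cosine similarity to the blend. The function name, the alpha parameter and the data source layout are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(q_sub_vec: Sequence[float], ctx_core_vec: Sequence[float],
           data_source: List[Tuple[str, Sequence[float]]],
           k: int = 2, alpha: float = 0.7) -> List[str]:
    """Rank entries of DS against a weighted blend of the sub-query and context vectors."""
    # alpha weights the sub-query; (1 - alpha) weights the compressed context.
    combined = [alpha * q + (1.0 - alpha) * c for q, c in zip(q_sub_vec, ctx_core_vec)]
    ranked = sorted(data_source, key=lambda item: cosine(combined, item[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]


DS = [("d1", [1.0, 0.0, 0.0]), ("d2", [0.0, 1.0, 0.0]), ("d3", [0.0, 0.0, 1.0])]
query_led = search([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], DS)
context_led = search([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], DS, alpha=0.2)
```

Raising alpha favours documents close to the sub-query; lowering it lets the conversation context steer the ranking, as the two usage lines show.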
S5, result re-ranking and filtering;
A LlamaIndex post-processor is used to weight each retrieval result based on similarity scores, keywords and metadata, and the final retrieval results are optimized through filtering and re-ranking;
The weighting process is as follows:
Score(R)=w1·Similarity(R,Q)+w2·Complexity(R)+w3·Relevance(R,Ctx)+w4·KeywordDensity(R);
Where the similarity score Similarity(R,Q) is computed from V(Q), the vector representation of the query, and V(R), the vector representation of the result; R represents a single retrieval result, Q is the user's query, Ctx represents the query context, and w1, w2, w3 and w4 are the weights of the respective factors. Results with scores below a threshold are filtered out according to this scoring mechanism, ensuring that each item in the output result set filtered_results has high relevance and quality.
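The weighted scoring and threshold filtering above can be sketched in a few lines. The sketch assumes the four component scores are already computed and scaled to [0, 1]; the default weights and threshold are arbitrary illustrative values, and the code does not use the LlamaIndex API.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Result:
    text: str
    similarity: float       # Similarity(R, Q)
    complexity: float       # Complexity(R)
    relevance: float        # Relevance(R, Ctx)
    keyword_density: float  # KeywordDensity(R)


Weights = Tuple[float, float, float, float]


def score(r: Result, w: Weights = (0.4, 0.1, 0.3, 0.2)) -> float:
    """Score(R) = w1*Similarity + w2*Complexity + w3*Relevance + w4*KeywordDensity."""
    w1, w2, w3, w4 = w
    return w1 * r.similarity + w2 * r.complexity + w3 * r.relevance + w4 * r.keyword_density


def rerank_and_filter(results: List[Result], threshold: float = 0.5,
                      w: Weights = (0.4, 0.1, 0.3, 0.2)) -> List[Result]:
    """Drop results scoring below the threshold and sort the rest, best first."""
    kept = [(score(r, w), r) for r in results if score(r, w) >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _, r in kept]


candidates = [
    Result("a", 0.9, 0.5, 0.8, 0.6),
    Result("b", 0.2, 0.5, 0.1, 0.1),
    Result("c", 1.0, 1.0, 1.0, 1.0),
]
filtered_results = rerank_and_filter(candidates)
```

With the default weights, result "a" scores 0.77, "b" scores 0.18 and is filtered out, and "c" ranks first.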
The optimized result set filtered_results is combined with the chat context Ctx to further process the user's subsequent questions and references, improving interaction consistency and user satisfaction.
a. Context update: the chat context is updated based on filtered_results and the original chat context Ctx to form updated_ctx; this includes integrating new information points and adjusting the focus of the context in order to respond more accurately to subsequent user queries.
b. Intelligent agent interaction: on the basis of the context update, the intelligent agent processes and answers the user's subsequent questions or references according to updated_ctx, so that the whole chat process is more natural and smooth.
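A minimal sketch of the context update in item a, assuming the chat context is a list of text entries, that "integrating new information points" means appending results not already present, and that "adjusting the focus" means keeping only the most recent entries. These interpretations and the max_items parameter are assumptions of the illustration.

```python
from typing import List


def update_context(ctx: List[str], filtered_results: List[str], max_items: int = 5) -> List[str]:
    """Form updated_ctx: append unseen result texts, then keep the most recent entries."""
    updated = list(ctx)  # copy, so the original Ctx is left untouched
    for fact in filtered_results:
        if fact not in updated:
            updated.append(fact)
    return updated[-max_items:] if max_items > 0 else []


Ctx = ["user: hi", "assistant: hello"]
updated_ctx = update_context(Ctx, ["fact1", "fact1", "fact2"], max_items=3)
```

The intelligent agent of item b would then answer follow-up questions against updated_ctx rather than the original Ctx.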
S6, response synthesis;
Based on the optimized retrieval results, a final Response is synthesized by iteratively refining the answer, by summarizing the retrieved context, or by generating multiple answers from different context blocks and then combining them into a final answer.
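Of the three synthesis options above, iterative refinement is sketched here. The language model is passed in as a plain callable, a hypothetical interface chosen so the example runs without any model; the prompt wording is likewise an assumption. The stub model in the usage lines only records the prompts it receives.

```python
from typing import Callable, List


def refine_answer(question: str, chunks: List[str], llm: Callable[[str], str]) -> str:
    """Draft an answer from the first context block, then refine it with each later block."""
    if not chunks:
        return llm(f"Question: {question}\nAnswer:")
    answer = llm(f"Context:\n{chunks[0]}\n\nQuestion: {question}\nAnswer:")
    for chunk in chunks[1:]:
        answer = llm(
            f"Question: {question}\nExisting answer: {answer}\n"
            f"New context:\n{chunk}\n\nRefine the existing answer if the new context helps:"
        )
    return answer


# Stub model for demonstration: records each prompt and returns a numbered draft.
calls: List[str] = []


def fake_llm(prompt: str) -> str:
    calls.append(prompt)
    return f"draft {len(calls)}"


final_response = refine_answer("What is nprobe?", ["chunk A", "chunk B", "chunk C"], fake_llm)
```

Each refinement call sees the previous draft plus one new context block, so the number of model calls equals the number of blocks.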
The embodiment of the application also discloses a text generation enhancement device applied to retrieval enhancement generation, which comprises the following modules:
The query preprocessing and feature extraction module is used for query preprocessing and feature extraction;
The adaptive retrieval optimization fine-tuning module is used for adaptive retrieval optimization and fine-tuning;
The historical performance analysis and intelligent query routing decision module is used for historical performance analysis and intelligent query routing decisions;
The context compression and vectorized retrieval module is used for context compression and vectorized retrieval;
The result re-ranking and filtering module is used for re-ranking and filtering the results and combining the optimized result set filtered_results with the chat context Ctx;
The response module is used for synthesizing the final Response.
The query preprocessing and feature extraction module comprises:
The conversion module is used for converting the query and the document content into a query vector and a document content vector respectively by adopting a BERT model or a Transformer model;
a vector index construction module for constructing an efficient vector index using FAISS tools;
the preprocessing module is used for preprocessing the query vector;
The preprocessing module comprises:
The denoising module is used for denoising the query vector;
And the normalization module is used for normalizing the query vector.
The feature extraction module is used for extracting query features F;
The feature extraction module comprises:
A length feature extraction module for extracting the length of the query as part of the query feature F;
The keyword density extraction module is used for extracting the keyword density of the query as part of the query feature F;
The semantic feature extraction module is used for extracting the semantic features of the query as part of the query feature F.
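As an illustration of the feature extraction module, the sketch below computes two of the listed features. It rests on assumptions: query length is counted in word tokens, keyword density is taken to be the share of tokens that are not stop words, and `STOP_WORDS` and `extract_features` are hypothetical names. Semantic features, which would normally come from an embedding model, are omitted.

```python
import re

# Hypothetical stop-word list; a real system would use a fuller, language-specific one.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "for", "and", "what", "how"}


def extract_features(query: str) -> dict:
    """Return part of F(Q): the length and keyword-density features of the query."""
    tokens = re.findall(r"\w+", query.lower())
    keywords = [t for t in tokens if t not in STOP_WORDS]
    # Guard against empty queries to avoid dividing by zero.
    density = len(keywords) / len(tokens) if tokens else 0.0
    return {"length": len(tokens), "keyword_density": density}


features = extract_features("What is the nprobe parameter of FAISS?")
```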
Wherein, the self-adaptive retrieval optimization fine tuning module comprises:
The natural language processing NLP module is used for deeply analyzing the complexity, the field characteristics and the urgency of the user query and identifying the field and the requirement of the query through comprehensive semantic analysis;
The analysis and conversion module is used for carrying out semantic analysis and conversion on the user query, and comprises the steps of rewriting the query and splitting the query into sub-queries Q sub;
A dynamic retrieval-strategy parameter adjustment module, used for balancing the accuracy and speed of the query vector search by adjusting the nprobe parameter of FAISS and selecting an appropriate hierarchical index level: nprobe is initially set to a low value and then gradually increased until the search accuracy reaches the required level;
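The "start low, then increase" adjustment of nprobe can be sketched as a simple loop. The code below is a sketch under stated assumptions: `evaluate` is a hypothetical callback that runs the search with a given nprobe and returns an accuracy estimate (for example recall@k on a validation set), doubling is one possible increase schedule, and `toy_recall` is an invented accuracy curve used only for demonstration.

```python
from typing import Callable


def tune_nprobe(evaluate: Callable[[int], float], target: float,
                start: int = 1, max_nprobe: int = 256) -> int:
    """Start with a low nprobe and double it until the measured accuracy reaches target."""
    nprobe = start
    while nprobe < max_nprobe and evaluate(nprobe) < target:
        nprobe *= 2
    # Never exceed the configured ceiling, even if the target was not reached.
    return min(nprobe, max_nprobe)


# Invented accuracy measurements standing in for a real evaluation run.
_TOY_CURVE = {1: 0.62, 2: 0.74, 4: 0.85, 8: 0.93, 16: 0.97}


def toy_recall(nprobe: int) -> float:
    return _TOY_CURVE.get(nprobe, 0.99)


chosen = tune_nprobe(toy_recall, target=0.9)
```

With a FAISS IVF index, the callback would typically set the index's `nprobe` attribute before running the search; larger values probe more inverted lists, which raises recall at the cost of latency.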
The historical efficiency analysis and intelligent query routing decision module comprises:
the learning module is used for learning more domain specific knowledge through classifying and specific retrieval of the external knowledge source, monitoring and evaluating the knowledge source in real time, integrating the latest reliable knowledge source, generating a local knowledge source retrieval library, recording the address of the reliable knowledge source and updating at regular time;
the evaluation module is used for evaluating the historical performance of each data source and index strategy according to the query characteristics F (Q) and the historical performance data H, and predicting the performance P (S|F (Q)) of each strategy on the current query by using a machine learning model;
the vector conversion and retrieval module is used for converting the query Q into a vector V (Q) and waiting for vector retrieval in a corresponding data source by using a selected index strategy S;
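The routing decision performed by the evaluation module amounts to picking the strategy with the highest predicted performance P(S|F(Q)). The sketch below assumes the machine learning model is exposed as a hypothetical callable `ml_model` that maps the feature dictionary to a per-strategy score; the strategy names and the rules inside `toy_model` are invented for illustration.

```python
from typing import Callable, Dict


def choose_strategy(features: Dict[str, float],
                    ml_model: Callable[[Dict[str, float]], Dict[str, float]]) -> str:
    """Return the strategy S with the highest predicted performance P(S | F(Q))."""
    predictions = ml_model(features)
    return max(predictions, key=predictions.get)


# Hand-written stand-in for a trained model (purely illustrative rules).
def toy_model(features: Dict[str, float]) -> Dict[str, float]:
    long_query = features["length"] > 10
    return {
        "flat_index": 0.4 if long_query else 0.8,
        "ivf_index": 0.7 if long_query else 0.5,
    }


short_choice = choose_strategy({"length": 5.0}, toy_model)
long_choice = choose_strategy({"length": 20.0}, toy_model)
```

A real model would be trained on the historical performance data H; the selection step itself stays the same.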
wherein the context compressor vectorization retrieval module comprises:
The context compression module is used for compressing the chat context Ctx to obtain key information Ctx core by utilizing the high-level large language model;
A search index construction module for constructing a search index to store the vectorized content of documents and queries;
The vectorization and retrieval module vectorizes the sub-query Q sub and the compressed chat context Ctx core, selects a proper data source DS to execute vectorization retrieval, and rapidly retrieves the most relevant information in the data source DS according to the converted vector representation;
The result rearrangement and filtering module is specifically used for utilizing a LlamaIndex post-processor, weighting each result of the search results based on similarity scores, keywords and metadata, and optimizing the final search results through filtering and re-ranking;
The result rearrangement and filtering module further comprises a combination module for combining the optimized result set filtered_results with the chat context Ctx;
a. Context update: updating chat context based on filtered_results and original chat context Ctx to form updated_ctx;
b. intelligent agent interaction: on the basis of the context update, the intelligent agent processes and answers the subsequent questions or references of the user according to the updated_ctx;
The Response module is specifically configured to synthesize a final Response by adopting a method of iteratively refining answers, abstracting the retrieved context, or generating a plurality of answers based on different context blocks and then synthesizing a final answer based on the optimized search result.
An embodiment of the invention also provides an electronic device comprising a processor and a memory, where the processor is coupled to the memory, for example via a bus. Optionally, the electronic device may further comprise a transceiver. It should be noted that, in practical applications, the number of transceivers is not limited to one, and the structure of the electronic device does not limit the embodiments of the present invention.
The processor may be a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or perform the various exemplary logic blocks, modules and circuits described in connection with this disclosure. A processor may also be a combination that performs computing functions, e.g., a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
A bus may include a path that communicates information between the components. The bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. Buses may be divided into address buses, data buses, control buses, etc.
The memory may be, but is not limited to, read-only memory (ROM) or another type of static storage device that can store static information and instructions, random-access memory (RAM) or another type of dynamic storage device that can store information and instructions, electrically erasable programmable read-only memory (EEPROM), compact disc read-only memory (CD-ROM) or other optical disc storage (including compact disc, laser disc, optical disc, digital versatile disc, Blu-ray disc, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
The memory is used for storing the application program code that executes the scheme of the invention, and execution is controlled by the processor. The processor is configured to execute the application program code stored in the memory so as to implement the method for text generation enhancement applied to search enhancement generation disclosed in the embodiments above.
The electronic device includes, but is not limited to: mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players) and in-vehicle terminals (e.g., in-vehicle navigation terminals), and stationary terminals such as digital TVs and desktop computers. The electronic device may also be a server or the like.
Embodiments of the present invention provide a computer-readable storage medium having a computer program stored thereon which, when run on a computer, causes the computer to perform the corresponding steps of the method for text generation enhancement applied to search enhancement generation disclosed in the above embodiments.
It should be understood that, although the steps in the flowcharts of the figures are shown in order as indicated by the arrows, these steps are not necessarily performed in order as indicated by the arrows. The steps are not strictly limited in order and may be performed in other orders, unless explicitly stated herein. Moreover, at least some of the steps in the flowcharts of the figures may include a plurality of sub-steps or stages that are not necessarily performed at the same time, but may be performed at different times, the order of their execution not necessarily being sequential, but may be performed in turn or alternately with other steps or at least a portion of the other steps or stages.
The above embodiments are not intended to limit the scope of protection of the present invention; all equivalent changes in structure, shape and principle of the invention shall fall within its scope of protection.
Claims (9)
1. A method for text generation enhancement applied to search enhancement generation, comprising the steps of:
S1, query preprocessing and feature extraction;
Adopting a BERT model or a Transformer model to convert the query and the document content into a query vector and a document content vector respectively, constructing an efficient vector index using the FAISS tool, preprocessing the user query Q, and extracting the query feature F;
Query feature extraction:
$F(Q) = \{f_1(Q), f_2(Q), \ldots, f_n(Q)\}$;
Wherein $f_i(Q)$ represents the i-th feature of query Q;
S2, self-adaptive searching and optimizing fine adjustment;
The complexity, the field characteristics and the urgency of the user query are deeply analyzed by utilizing natural language processing NLP, the field and the requirement of the query are identified through comprehensive semantic analysis, and the user query is subjected to semantic analysis and conversion, including rewriting the query and splitting the query into sub-queries Q sub;
Dynamically adjusting retrieval strategy parameters;
The accuracy and speed of the query vector search are balanced by adjusting the nprobe parameter of FAISS and selecting an appropriate hierarchical index level: nprobe is initially set to a low value and then gradually increased until the search accuracy reaches the desired level;
Assuming that Q is a vector representation of the user query, D is a set of representations of the document content vectors, and f (Q, D) is a scoring function representing the similarity between the query and the document; c is the context information of the query, the dynamic parameter adjustment is expressed as a function g (Q, C), the output is the search parameter set P, and the search process can be expressed by the following formula:
$R = \arg\max_{D \in S(g(Q, C))} f(Q, D)$;
Wherein $S(g(Q, C))$ is the search strategy and index range determined by the search parameter set P;
S3, historical efficiency analysis and intelligent query routing decision;
Through classifying and specific searching the external knowledge sources, learning more domain specific knowledge, monitoring and evaluating the knowledge sources in real time, integrating the latest reliable knowledge sources, generating a local knowledge source search library, recording the addresses of the reliable knowledge sources and updating the addresses regularly;
evaluating historical performance of each data source and index strategy according to the query characteristics F (Q) and the historical performance data H, and predicting the performance P (S|F (Q)) of each strategy on the current query by using a machine learning model;
$P(S \mid F(Q)) = \mathrm{MLModel}(F(Q))$;
Wherein MLModel denotes the machine learning model used to predict the performance of each strategy; the optimal data source and indexing strategy S are selected based on the performance prediction $P(S \mid F(Q))$;
converting the query Q into a vector V (Q), and waiting for vector retrieval in the corresponding data source using a selected indexing strategy S;
$V(Q) = \mathrm{Embed}(Q)$;
wherein Embed represents the query embedding function;
S4, query conversion, context compression and vectorization retrieval;
Compressing chat context Ctx to obtain key information Ctx core by using an advanced large language model;
constructing a search index to store the vectorized content of the documents and queries;
Vectorizing the sub-query Q sub and the compressed chat context Ctx core, selecting a proper data source DS to perform vectorization retrieval, and rapidly retrieving the most relevant information in the data source DS according to the converted vector representation, wherein the retrieval process can be represented by the following formula:
$\mathrm{Results} = \mathrm{Search}(V(Q_{sub}), V(Ctx_{core}), DS)$;
Wherein $V(Q_{sub})$ and $V(Ctx_{core})$ represent the vector representation of the sub-query and the vector representation of the compressed context, respectively, and the Search function represents the vectorized search operation performed on the selected data source DS, whose output is the search result set;
S5, rearranging and filtering results;
Each result of the search results is weighted based on similarity scores, keywords and metadata by utilizing a LlamaIndex post-processor, and the final search results are optimized through filtering and re-ranking;
Combining the optimized result set filtered_results with the chat context Ctx;
a. Context update: updating chat context based on filtered_results and original chat context Ctx to form updated_ctx;
b. intelligent agent interaction: on the basis of the context update, the intelligent agent processes and answers the subsequent questions or references of the user according to the updated_ctx;
S6, response synthesis;
based on the optimized search results, a final Response is synthesized by one of the following methods: iteratively refining the answer, summarizing the retrieved context, or generating several answers from different context blocks and then merging them into a final answer.
2. The method for text generation enhancement for search enhancement generation according to claim 1, wherein said preprocessing of user query Q includes denoising and normalization.
3. The method of claim 1, wherein the extracted query features F comprise a length of the query, a keyword density of the query, and semantic features of the query.
4. The method of claim 1, wherein the weighting formula in S5 is:
similarity score:
where V(Q) represents the vector representation of the query and V(R) represents the vector representation of the result;
$\mathrm{Score}(R) = w_1 \cdot \mathrm{Similarity}(R, Q) + w_2 \cdot \mathrm{Complexity}(R) + w_3 \cdot \mathrm{Relevance}(R, Ctx) + w_4 \cdot \mathrm{KeywordDensity}(R)$;
where R represents a single search result, Q is the user's query, Ctx represents the query context, and w1, w2, w3 and w4 are the weights of the respective factors; results whose scores fall below the threshold are filtered out according to this scoring mechanism.
5. An apparatus for text generation enhancement applied to search enhancement generation, comprising the following modules:
The query preprocessing and feature extraction module comprises:
The conversion module is used for converting the query and the document content into a query vector and a document content vector respectively by adopting a BERT model or a Transformer model;
a vector index construction module for constructing an efficient vector index using FAISS tools;
the preprocessing module is used for preprocessing the query vector;
The feature extraction module is used for extracting query features F;
an adaptive search optimization fine tuning module comprising:
The natural language processing NLP module is used for deeply analyzing the complexity, the field characteristics and the urgency of the user query and identifying the field and the requirement of the query through comprehensive semantic analysis;
The analysis and conversion module is used for carrying out semantic analysis and conversion on the user query, and comprises the steps of rewriting the query and splitting the query into sub-queries Q sub;
A dynamic retrieval-strategy parameter adjustment module, used for balancing the accuracy and speed of the query vector search by adjusting the nprobe parameter of FAISS and selecting an appropriate hierarchical index level: nprobe is initially set to a low value and then gradually increased until the search accuracy reaches the required level;
The historical efficiency analysis and intelligent query route decision module comprises:
the learning module is used for learning more domain specific knowledge through classifying and specific retrieval of the external knowledge source, monitoring and evaluating the knowledge source in real time, integrating the latest reliable knowledge source, generating a local knowledge source retrieval library, recording the address of the reliable knowledge source and updating at regular time;
the evaluation module is used for evaluating the historical performance of each data source and index strategy according to the query characteristics F (Q) and the historical performance data H, and predicting the performance P (S|F (Q)) of each strategy on the current query by using a machine learning model;
the vector conversion and retrieval module is used for converting the query Q into a vector V (Q) and waiting for vector retrieval in a corresponding data source by using a selected index strategy S;
a context compressor vectorization retrieval module comprising:
The context compression module is used for compressing the chat context Ctx to obtain key information Ctx core by utilizing the high-level large language model;
A search index construction module for constructing a search index to store the vectorized content of documents and queries;
The vectorization and retrieval module vectorizes the sub-query Q sub and the compressed chat context Ctx core, selects a proper data source DS to execute vectorization retrieval, and rapidly retrieves the most relevant information in the data source DS according to the converted vector representation;
The result rearrangement and filtering module is used for weighting each result of the search results based on similarity scores, keywords and metadata by utilizing a LlamaIndex post-processor, and optimizing the final search results through filtering and re-ranking;
The combination module is used for combining the optimized result set filtered_results with the chat context Ctx;
a. Context update: updating chat context based on filtered_results and original chat context Ctx to form updated_ctx;
b. intelligent agent interaction: on the basis of the context update, the intelligent agent processes and answers the subsequent questions or references of the user according to the updated_ctx;
And the Response module is used for synthesizing a final Response by adopting a method of iteratively refining answers, abstracting the searched context or generating a plurality of answers based on different context blocks and then synthesizing the final answer based on the optimized search result.
6. An apparatus for text generation enhancement applied to search enhancement generation according to claim 5, wherein said preprocessing module comprises:
The denoising module is used for denoising the query vector;
And the normalization module is used for normalizing the query vector.
7. An apparatus for text generation enhancement applied to search enhancement generation according to claim 5, wherein the feature extraction module comprises:
A length feature extraction module for extracting the length of the query as part of the query feature F;
The keyword density extraction module is used for extracting the keyword density of the query as part of the query feature F;
The semantic feature extraction module is used for extracting the semantic features of the query as part of the query feature F.
8. An electronic device, comprising:
One or more processors;
A memory;
One or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more applications being configured to perform the method for text generation enhancement applied to search enhancement generation according to any of claims 1 to 7.
9. A computer readable storage medium having stored thereon a computer program, characterized in that the program when executed by a processor implements a method for text generation enhancement for search enhancement generation according to any of claims 1 to 7.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410512961.4A CN118377844B (en) | 2024-04-26 | 2024-04-26 | Text generation enhancement method and device applied to search enhancement generation |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN118377844A (en) | 2024-07-23 |
| CN118377844B (en) | 2024-10-08 |
Family ID: 91901193
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202410512961.4A Active CN118377844B (en) | 2024-04-26 | 2024-04-26 | Text generation enhancement method and device applied to search enhancement generation |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN118377844B (en) |
Cited By (11)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN118779437A (en) * | 2024-09-12 | 2024-10-15 | 北京市大数据中心 | A large model RAG method based on hierarchical indexing and hybrid retrieval |
| CN119128045A (en) * | 2024-09-10 | 2024-12-13 | 山东财经大学 | Optimization Methods of Index Retrieval in Digital Library |
| CN119271706A (en) * | 2024-12-10 | 2025-01-07 | 中科南京人工智能创新研究院 | A method for extracting external data for retrieval-enhanced generation system |
| CN119336863A (en) * | 2024-12-20 | 2025-01-21 | 博思数采科技股份有限公司 | A procurement knowledge retrieval method and system based on large language model |
| CN119961378A (en) * | 2025-01-10 | 2025-05-09 | 北京炼石网络技术有限公司 | RAG data processing method and system for data security |
| CN120011553A (en) * | 2025-01-14 | 2025-05-16 | 广东外语外贸大学 | A retrieval enhancement generation method and system based on historical information drive |
| CN120541206A (en) * | 2025-07-28 | 2025-08-26 | 浪潮通信信息系统有限公司 | Search enhancement generation method and device |
| CN120821818A (en) * | 2025-09-18 | 2025-10-21 | 北京中数睿智科技有限公司 | Automatic adjustment method of retrieval enhancement generation parameters based on content feature modeling |
| CN121071060A (en) * | 2025-11-06 | 2025-12-05 | 深圳市明心数智科技有限公司 | Content hierarchical weighting-based retrieval enhancement generation method, device and storage medium |
| KR102900995B1 (en) * | 2024-08-14 | 2025-12-19 | 주식회사 크랩스 | Apparatus and Method for Producing Video Content that Matches the User's Intention with High Accuracy |
| WO2026056764A1 (en) * | 2024-09-12 | 2026-03-19 | 阿里巴巴(中国)有限公司 | Human-computer dialogue method, server, storage medium and program product |
Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN116628171A (en) * | 2023-07-24 | 2023-08-22 | 北京惠每云科技有限公司 | A medical record retrieval method and system based on a pre-trained language model |
Non-Patent Citations (1)
| Title |
|---|
| 沈思;李成名;吴鹏;: "基于时态语义的Web信息检索实践进展与研究综述", 中国图书馆学报, no. 04, 15 July 2018 (2018-07-15), pages 111 - 131 * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN118377844B (en) | 2024-10-08 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||