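WO2023082305A1 - Library construction element, kit and library construction method compatible with dual sequencing platforms - Google Patents

Library construction element, kit and library construction method compatible with dual sequencing platforms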

Info

Publication number
WO2023082305A1
Authority
WO
WIPO (PCT)
Prior art keywords
library
seq
terminal
index
index sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2021/131508
Other languages
English (en)
French (fr)
Inventor
汪彪
胡玉刚
吴强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanodigmbio Nanjing Biotechnology Co Ltd
Original Assignee
Nanodigmbio Nanjing Biotechnology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanodigmbio Nanjing Biotechnology Co Ltd filed Critical Nanodigmbio Nanjing Biotechnology Co Ltd
Priority to EP21951111.0A priority Critical patent/EP4202058A4/en
Priority to US18/016,857 priority patent/US20240279647A1/en
Publication of WO2023082305A1 publication Critical patent/WO2023082305A1/zh
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Classifications

    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/10Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034Isolating an individual clone by screening libraries
    • C12N15/1093General methods of preparing gene libraries, not provided for in other subgroups
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6844Nucleic acid amplification reactions
    • C12Q1/6853Nucleic acid amplification reactions using modified primers or templates
    • C12Q1/6855Ligating adaptors
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869Methods for sequencing
    • CCHEMISTRY; METALLURGY
    • C40COMBINATORIAL TECHNOLOGY
    • C40BCOMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06Biochemical methods, e.g. using enzymes or whole viable microorganisms

Definitions

  • The invention relates to the field of high-throughput sequencing library construction, in particular to a library construction element, a kit and a library construction method compatible with dual sequencing platforms.
  • High-throughput sequencing obtains sequence information by determining nucleic acid sequences.
  • The mainstream next-generation sequencers are the Illumina, MGI and Life sequencers; Illumina sequencers are the most widely used, followed by MGI sequencers.
  • The Illumina and MGI sequencers share the same sequencing principle: the nucleic acid sequence is read by sequencing-by-synthesis, in which each nucleotide incorporated during synthesis is labeled with a different fluorescent signal.
  • The Life sequencer instead detects the electronic signal released during synthesis.
  • Because sequencing on Illumina and MGI instruments relies on fluorescently labeled nucleotides, Illumina sequencers have, as the technology developed, acquired three fluorescence-channel and imaging modes: a four-color fluorescence four-channel mode, a three-color fluorescence two-channel mode and a four-color fluorescence single-channel mode.
  • The MGI sequencer likewise has a four-color fluorescence four-channel mode and a three-color fluorescence two-channel mode. All of these modes detect fluorescent signals, and because the signals overlap with one another, filters are needed to minimize mutual interference during detection. A complementary measure is to keep the bases as balanced as possible when pooled samples are sequenced.
  • The index sequences need to be balanced in the same way. If balance is built into the index design, the operator does not have to work out which index combinations can be loaded together when arranging the channels of a run.
  • On Illumina instruments, a linear amplified library is loaded for sequencing.
  • On MGI instruments, the library is loaded for sequencing after circularization, so an Illumina library must be processed separately before it can be circularized and run on an MGI instrument.
  • At present this processing is done by PCR amplification.
  • the main purpose of the present invention is to provide a library building component, a kit and a library building method compatible with dual sequencing platforms, so as to solve the problem of low compatibility of the library building solutions in the prior art on the two sequencing platforms.
  • A library construction method compatible with dual sequencing platforms includes: constructing a library from a target sample using primers or adapters with a 5' phosphorylation modification to obtain a linear amplified library with a 5' phosphorylation modification, which is a linear library suitable for the Illumina sequencing platform; or further circularizing that linear amplified library to obtain a circularized library suitable for the MGI sequencing platform.
  • The 5' phosphorylation-modified primers include the P5 truncated amplification primer SEQ ID NO: 1 and the P7 truncated amplification primer SEQ ID NO: 2; the 5' phosphorylation-modified adapters include the P5 full-length adapter SEQ ID NO: 3 and the P7 full-length adapter SEQ ID NO: 4; wherein SEQ ID NO: 1: /5Phos/AATGATACGGCG
  • index sequence at the P5 terminal is selected from any one in Table 1-1
  • index sequence at the P7 terminal is selected from any one in Table 1-2.
  • When there are multiple target samples, the P5-terminal index sequences corresponding to them are selected from any set of 4-base balanced tag sequences in Table 1-1, and the P7-terminal index sequences corresponding to them are selected from any set of 4-base balanced tag sequences in Table 1-2.
  • A set of 4-base balanced tag sequences is a group of 4 tag sequences that are balanced, that is, at each of positions 1 to 10 of the tag sequence there is one each of the bases A, T, G and C.
  • Obtaining a linear amplified library with a 5' phosphorylation modification includes: ligating the truncated adapters shown in SEQ ID NO: 7 and SEQ ID NO: 8 to fragments derived from the target sample to obtain adapter-ligated fragments; and amplifying the adapter-ligated fragments with the 5' phosphorylation-modified primers shown in SEQ ID NO: 1 and SEQ ID NO: 2 to obtain the linear amplified library with a 5' phosphorylation modification; wherein SEQ ID NO: 7: ACACTCTTTCCCTACACGACGCTCTTCCGATC*T, where * represents a sulfur (phosphorothioate) modification; SEQ ID NO: 8: /5Phos/GATCGGAAGAGCACACGTCTGAACTCCAGTCAC.
  • Constructing a library from the target sample with 5' phosphorylation-modified adapters to obtain a linear amplified library with a 5' phosphorylation modification includes: ligating the full-length adapters shown in SEQ ID NO: 3 and SEQ ID NO: 4 to fragments from the target sample to obtain an adapter-ligated library; and amplifying the adapter-ligated library with the library amplification primers shown in SEQ ID NO: 5 and SEQ ID NO: 6 to obtain a 5' phosphorylation-modified linear amplified library; wherein SEQ ID NO: 5: /5Phos/AATGATACGGCGACCACCGAGAT; SEQ ID NO: 6: CAAGCAGAAGACGGCATACGA.
  • The library construction method also includes the step of performing target capture on the linear amplified library; preferably, the capture library obtained after target capture is amplified with 5' phosphorylation-modified library amplification primers to obtain a linear amplified capture library.
  • The linear amplified capture library is circularized to obtain a circular library suitable for the MGI sequencing platform; preferably, the 5' phosphorylation-modified library amplification primers include the P5 phosphorylated primer shown in SEQ ID NO: 5 and the P7 primer shown in SEQ ID NO: 6.
  • A library construction kit compatible with dual sequencing platforms includes any one of the following combinations: 1) Combination 1: P5 truncated amplification primer SEQ ID NO: 1 and P7 truncated amplification primer SEQ ID NO: 2, wherein SEQ ID NO: 1: /5Phos/AATGATACGGCGACCACCGAGATCTACANNNNNNNNACACTCTTTCCCTACACGAC, where the 10 Ns represent the P5-terminal index sequence; SEQ ID NO: 2: CAAGCAGAAGACGGCATACGAGATNNNNNNNNNNNGTGACTGGAGTTCAGACGTGT, where the 10 Ns represent the P7-terminal index sequence; 2) Combination 2: P5 full-length adapter SEQ ID NO: 3 and P7 full-length adapter SEQ ID NO: 4, wherein SEQ ID NO: 3: /5Phos/AATGATACGGCGACCACCGAGATCTACACNNNNNNNNACACTCTTTCCCTACACGACGCTCTTCCGATC*T, 10 N
  • index sequence at the P5 terminal is selected from any one in Table 1-1
  • index sequence at the P7 terminal is selected from any one in Table 1-2.
  • the library construction kit includes 412 P5-terminal index sequences and 432 P7-terminal index sequences.
  • the P5-terminal index sequences are shown in Table 1-1, and the P7-terminal index sequences are shown in Table 1-2.
  • The P5-terminal index sequences and/or the P7-terminal index sequences are used in sets of 4-base balanced tag sequences.
  • the library construction kit also includes library amplification primers shown in SEQ ID NO: 5 and SEQ ID NO: 6, and/or truncated adapters shown in SEQ ID NO: 7 and 8.
  • A library construction element compatible with dual sequencing platforms is provided; the library construction element is selected from any one of the following combinations: 1) Combination 1: P5 truncated amplification primer SEQ ID NO: 1 and P7 truncated amplification primer SEQ ID NO: 2, wherein SEQ ID NO: 1: /5Phos/AATGATACGGCGACCACCGAGATCTACANNNNNNNNACACTCTTTCCCTACACGACGAC, where the 10 Ns represent the P5-terminal index sequence; SEQ ID NO: 2: CAAGCAGAAGACGGCATACGAGATNNNNNNNNNNGTGACTGGAGTTCAGACGTGT, where the 10 Ns represent the P7-terminal index sequence; 2) Combination 2: P5 full-length adapter SEQ ID NO: 3 and P7 full-length adapter SEQ ID NO: 4, wherein SEQ ID NO: 3: /5Phos/AATGATACGGCGACCACCGAGATCTACACNNNNNNNNACACTCTTTCCCTACACGACGCTCTTCC
  • index sequence at the P5 terminal is selected from any one in Table 1-1
  • index sequence at the P7 terminal is selected from any one in Table 1-2.
  • The library construction element is an amplification primer composition or an adapter composition.
  • The amplification primer composition includes multiple sets of P5 truncated amplification primers and/or multiple sets of P7 truncated amplification primers; each set of P5 truncated amplification primers contains any set of 4-base balanced tag sequences selected from Table 1-1, and each set of P7 truncated amplification primers contains any set of 4-base balanced tag sequences selected from Table 1-2.
  • The adapter composition includes multiple sets of P5 full-length adapters and/or multiple sets of P7 full-length adapters; each set of P5 full-length adapters contains any set of 4-base balanced tag sequences selected from Table 1-1, and each set of P7 full-length adapters contains any set of 4-base balanced tag sequences selected from Table 1-2.
  • A set of 4-base balanced tag sequences is a group of 4 tag sequences that are balanced, that is, at each position of the tag sequence there is one each of the bases A, T, G and C.
  • Figure 1 shows how an Illumina library can be converted for loading on the MGI sequencing platform.
  • Figure 2 shows library construction schemes whose libraries can be loaded on both platforms.
  • Figure 3 shows a dual-platform-compatible amplification scheme for amplification after targeted capture.
  • Figure 4 shows the proportion of base deletions in the index region.
  • Figure 5 shows the read direction and flanking-base factors to consider when designing an index compatible with both sequencing platforms.
  • Figure 6 shows that designing the index of the P5-end adapter requires considering the 10 index bases together with the C and A bases before and after them.
  • Figure 7 shows that designing the index of the P7-end adapter requires considering the 10 index bases together with the T and G bases before and after them.
  • Figure 8 shows that, once the A base at the P5 end is taken into account, indexes designed in the Illumina patent with an edit distance of 3 have an edit distance of only 2.
  • Figure 9 shows that many of the 8 bp IDT-version indexes currently used by Illumina have an edit distance of only 2.
  • Figure 10 shows that two-color-balanced and four-base-balanced indexes differ greatly in balance when run on different sequencers.
  • Figure 11 shows that the IDT version of the product does not take balance into account.
  • Figure 12 compares the library yield of the two compatible schemes with that of the Illumina-only scheme.
  • Figure 13 shows the library yield of 96 sets of library amplification primers in Scheme 1 of the present invention.
  • Figure 14 shows the demultiplexing of sequencing data from the 96 dual-platform-compatible libraries on both platforms.
  • Figure 15 shows the lowest and highest base ratios of the four-index-balanced design for combinations 1-12.
  • Figure 16 compares data demultiplexing between the present invention and the Illumina-recommended 8-group and 12-group pooled runs.
  • Dual-index adapter: in high-throughput sequencing, a common sequencing adapter must be ligated to the ends of each fragment. Each non-complementary arm of the adapter carries a variable sequence region, the index sequence, which is used to split (demultiplex) the sequencing data.
  • A DNA sequence consists of four bases: A, T, G and C.
  • For the bases to be read reliably during sequencing, a set of tag sequences is combined so that A, T, G and C occur in equal proportions at every position of the tag sequence.
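The equal-proportion requirement can be checked numerically. The following is a minimal Python sketch, not part of the application, that tabulates the base fractions at each position of a pooled set of index sequences. The function name `position_base_fractions` and the four example sequences are hypothetical; the sequences are not taken from Table 1-1 or Table 1-2.

```python
from collections import Counter


def position_base_fractions(indexes):
    """Return, for each position, the fraction of A/C/G/T across the pooled indexes."""
    if not indexes:
        raise ValueError("need at least one index sequence")
    length = len(indexes[0])
    if any(len(seq) != length for seq in indexes):
        raise ValueError("all index sequences must have the same length")
    fractions = []
    for pos in range(length):
        # Count the bases that the pooled indexes present at this read cycle.
        counts = Counter(seq[pos] for seq in indexes)
        fractions.append({base: counts.get(base, 0) / len(indexes) for base in "ACGT"})
    return fractions


# Hypothetical 4-member pool, chosen only so that every column holds A, C, G and T.
pool = ["ACGT", "CGTA", "GTAC", "TACG"]
fractions = position_base_fractions(pool)
for pos, row in enumerate(fractions, start=1):
    print(pos, row)
```

A pool is balanced in the sense described above when every row reports 0.25 for each base.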
  • High-throughput sequencing is an important and widely used massively parallel sequencing technology.
  • Providers of massively parallel sequencing (hereinafter "MPS") technology include three companies: Illumina, MGI and Ion Torrent. Illumina and MGI sequencers are the more widely used; many companies and large research institutions are equipped with both, and the two have the same sequencing principle and sequencing quality. A common library construction scheme that allows the same library to be sequenced on either platform would therefore save considerable effort and be more convenient for users.
  • the applicant has developed single-end and paired-end index library construction solutions for the MGI sequencing platform, as well as Illumina library construction solutions.
  • A library built with the current Illumina scheme can be sequenced only on the Illumina platform. To run it on an MGI sequencer, it must first go through MGI's App-a conversion scheme, currently the way to load an Illumina library on an MGI instrument, which adds the terminal phosphate through 3-5 rounds of PCR amplification (Figure 1). In addition, because the index design does not consider base balance within groups of four indexes, arranging such libraries on the instrument is difficult.
  • The conversion scheme that MGI currently provides, shown in Figure 1, performs conversion and amplification on an already constructed Illumina library.
  • The redundancy of the MGI sequencing platform is lower than that of the Illumina sequencing platform.
  • For whole-exome sequencing, the redundancy can be kept at about 2%.
  • The Illumina NovaSeq 6000 platform has an inherent redundancy of about 20%. After MGI's conversion scheme, therefore, the low-redundancy advantage of the MGI sequencing platform itself is lost.
  • This application takes full account of two-color and four-color channels and designs the indexes so that the four bases are absolutely balanced within each group of four indexes. The library construction adapters of this application can therefore be used on both the Illumina and MGI sequencing platforms: index sequencing quality is ensured, and operators can arrange sequencing runs easily.
  • In one scheme of the present application, the library is built by ligating truncated adapters and then amplified with dual-index primers, which add the index sequences at both ends.
  • This scheme differs from previous Illumina library construction schemes in that the 5' end of the P5-end index primer is phosphorylated. The phosphate does not affect amplification, and the library already carries the phosphate needed for circularization if it is later used on the MGI sequencing platform, as shown in the left half of Figure 2.
  • A library constructed in this way can be loaded directly on the Illumina sequencing platform, or circularized directly and loaded on the MGI sequencing platform.
  • In the other scheme, the adapter used to build the library is a full-length Y-shaped adapter whose P5 arm is phosphorylated at the 5' end.
  • The advantage of this design is that it supports PCR-free library construction: the library can be sequenced directly on the Illumina platform, or circularized directly and sequenced on the MGI platform.
  • The full-length-adapter library can also be amplified with the phosphorylated P5 primer and the P7 primer and then sequenced directly on the Illumina platform, or circularized directly and sequenced on the MGI platform.
  • the scheme of the present application can be further extended and applied to the scene of targeted capture sequencing.
  • After target capture, the library is amplified with the phosphorylated P5 primer and the P7 primer; only the primer at the P5 end carries the phosphorylation modification. Because sequencing is directional and circularization on the MGI platform circularizes only one strand, only the strand phosphorylated at the 5' end of the P5 terminal needs to be circularized. This ensures that the amplified post-capture library can be sequenced on Illumina, or sequenced on the MGI platform after direct circularization.
  • P5/P7, or P5 end/P7 end, as used in this application always refers to the universal P5 and P7 sequences of the Illumina sequencing platform.
  • The applicant found base deletions in the synthesized index region, and inferred that bases are deleted with a certain probability and in a certain proportion during primer or adapter synthesis.
  • As shown in Figure 4, analysis of the index-read data showed about 0.2%-2.8% single-base deletions in both desalted and HPLC-purified sequences synthesized by IDT; HPLC purification reduces the deletions but does not eliminate them completely.
  • The first base after the index must therefore also be considered, as shown in Figure 5. Because every index read direction used by Illumina and MGI has to be covered, the first base beyond the index in both the forward and the reverse direction must be considered for the P5-end index and for the P7-end index.
  • As shown in Figure 6, the P5-end index is read in the forward or the reverse direction depending on the Illumina sequencer model, whereas MGI sequencers read it in the reverse direction. At the P5 end, therefore, both the C base immediately in front of the index and the A base immediately behind it must be considered, since one or the other is read right after the index depending on the read direction. Similarly, as shown in Figure 7, the P7-end index is read in the reverse direction on the Illumina platform, so the front base T must be considered, while the MGI platform reads the index in the forward direction, so the back base G must be considered. The way deletions (with the neighboring base filling the gap) and sequencing errors change the differences between indexes has not been mentioned in previously published articles or patent documents.
  • For a 10 bp index, 7198 index sequences with an edit distance of 3 were calculated.
  • When the 10 bp sequence is considered together with the adapter bases adjacent to the index, only a little over 1,000 indexes keep a strict edit distance of 3; the indexes are additionally grouped in fours, and the P5-end and P7-end indexes are considered separately.
  • For an 8 bp index, the number of index sequences with an edit distance of 3 is correspondingly smaller.
  • The 384 8 bp index sequences designed by IDT and officially recommended for the Illumina sequencing platform have the same problem.
  • For example, the IDT-version P5-end index 4 (UDP0004) turns into the intermediate sequence after a single base mutation, and index 3 (UDP0003) turns into the same intermediate sequence once its first G base is deleted, because the A that follows the index then shifts forward into the read.
  • Designing indexes by considering one base before and after the index together with base deletions or mutations has not been mentioned in previously published articles or currently published patents, and the existing public index sequences do not have a strict edit distance of 3. To screen out enough sequences with a strict edit distance of 3, this application therefore uses dual-end index sequences 10 bp long.
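The effect of a flanking base on edit distance can be reproduced with a small calculation. The Python sketch below is illustrative only: `edit_distance` is a standard Levenshtein distance, `read_after_deletion` is a hypothetical helper that models a fixed-length index read in which one synthesized base is missing and the adapter base after the index fills the last cycle, and the 8-base sequences are invented for the example (they are not UDP0003/UDP0004 or entries of Table 1-1).

```python
def edit_distance(a, b):
    """Levenshtein distance (substitutions, insertions, deletions) between a and b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def read_after_deletion(index, downstream_base, deleted_pos):
    """Index read observed when the base at deleted_pos is missing.

    The remaining bases shift forward by one cycle and the adapter base
    that follows the index is read in the final cycle.
    """
    return index[:deleted_pos] + index[deleted_pos + 1:] + downstream_base


# Hypothetical indexes; "A" is the adapter base read right after the index.
index_x = "GTCAGTCA"
index_y = "TCACTCAA"

print(edit_distance(index_x, index_y))                 # nominal distance between the two indexes
read = read_after_deletion(index_x, "A", deleted_pos=0)
print(read, edit_distance(read, index_x), edit_distance(read, index_y))
```

In this example the two indexes are 3 edits apart when compared on their own, yet a single deletion in `index_x` produces a read that is only one substitution away from `index_y`, which is why the flanking base has to be part of the distance calculation.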
  • The present application also found that, because both the Illumina and MGI platforms include sequencers with 2-color and 4-color channels, ensuring index sequencing quality requires considering the base balance between indexes in addition to the bases before and after the index.
  • The index sequences in Illumina's products and patents do not address this issue.
  • The applicant weighed whether two-color balance (two channels) or absolute balance across four indexes (four channels) would be more conducive to sequencing on both platforms.
  • So that both the 2-color and the 4-color sequencers of the two platforms give good sequencing data quality, the present invention adopts absolute balance of the 4 bases within each group of 4 indexes.
  • Because the Illumina platform has not carefully addressed this aspect, the combination of indexes must be chosen carefully whenever samples are sequenced together on the Illumina platform.
  • Base statistics for indexes 1-12 of the IDT version recommended for the Illumina platform, taken in groups of 4, 8 and 12, show that some bases are entirely absent at certain positions in the group of 8,
  • and that in the group of 12 the minimum base ratio is 8.3% (for example, the C base at position 7 and the A base at position 8), far below the MGI sequencer's requirement of no less than 12.5%.
  • With the absolutely balanced groups of 4 indexes in this application, the minimum base ratio is more than 14.28% when four or more samples with consecutive indexes are loaded in equal amounts, which meets the requirements of the various instrument models of both sequencing platforms.
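The 14.28% figure is consistent with a worst case of 1/7, and the 8.3% figure with 1/12. The Python sketch below works this out under two assumptions that are made here for illustration and are not stated in the text: samples are loaded in equal amounts, and the consecutive indexes start at the first member of a balanced group of four. The function name `worst_case_min_fraction` is hypothetical.

```python
from fractions import Fraction

GROUP_SIZE = 4  # indexes per balanced group: one A, C, G and T at every position


def worst_case_min_fraction(n_samples):
    """Lowest share any single base can have at an index position.

    Assumes equal loading and that the n_samples consecutive indexes
    begin at a group boundary (assumptions of this sketch).
    """
    if n_samples < 1:
        raise ValueError("n_samples must be positive")
    full_groups = n_samples // GROUP_SIZE
    # Each complete group supplies every base exactly once per position.
    # An incomplete trailing group shows fewer than four distinct bases at
    # a position, so at least one base gains nothing from it.
    return Fraction(full_groups, n_samples)


ratios = {n: worst_case_min_fraction(n) for n in range(4, 97)}
worst_n = min(ratios, key=ratios.get)
worst = ratios[worst_n]
print(worst_n, worst, float(worst))

# For comparison: one occurrence of a base among 12 indexes.
print(float(Fraction(1, 12)))
```

Under these assumptions the least favorable pool is 7 samples (one full group plus three members of the next), where a base can appear once in seven indexes; every other pool size from 4 to 96 gives a higher minimum.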
  • Such an index design both simplifies the arrangement of sequencing runs and improves and safeguards data quality.
  • The 5' end of the indexed, truncated P5-end amplification primer is phosphorylated (see SEQ ID NO: 1), the 5' end of the P5 sequence of the full-length adapter is phosphorylated (see SEQ ID NO: 3), and the 5' end of the truncated P5 primer without index is phosphorylated (see SEQ ID NO: 5);
  • The P5-end index is designed as the 10 index bases plus the C and A bases before and after them (C-10 base index-A); the P7-end index is designed as the 10 index bases plus the T and G bases before and after them (T-10 base index-G);
  • On the basis of the first two points, the present invention designs groups of 4 index sequences that are strictly balanced at all 10 base positions.
  • The present invention makes the following improvements on the basis of the Illumina design:
  • Ns represent the index sequence, which can be any sequence in Table 1-1;
  • Ns represent the index sequence, which can be any sequence in Table 1-2.
  • P5 phosphorylated amplification primer SEQ ID NO: 5: /5Phos/AATGATACGGCGACCACCGAGAT, which differs from the original Illumina primer only in the 5' phosphorylation;
  • The P7 amplification primer is the same as the original Illumina primer and is not described further here.
  • The truncated adapters are identical to those of the original Illumina protocol.
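How the N placeholder is replaced by a concrete index can be sketched in a few lines of Python. This is an illustration only: the template is SEQ ID NO: 1 as printed in this text, `/5Phos/` is kept as the 5' phosphate notation used in the sequence listing, and `fill_index` and the example index are hypothetical (the index is not taken from Table 1-1). Because the printed run of Ns does not match the stated 10 positions, the sketch replaces the whole run rather than assuming its length.

```python
import re

# SEQ ID NO: 1 as printed in this text; the N run marks the index position.
P5_TRUNCATED_TEMPLATE = (
    "/5Phos/AATGATACGGCGACCACCGAGATCTACANNNNNNNNACACTCTTTCCCTACACGAC"
)


def fill_index(template, index):
    """Replace the single run of N placeholders in template with index."""
    if not re.fullmatch(r"[ACGT]+", index):
        raise ValueError("index must contain only A, C, G, T")
    if len(re.findall(r"N+", template)) != 1:
        raise ValueError("template must contain exactly one run of N placeholders")
    return re.sub(r"N+", index, template, count=1)


# Hypothetical 10-base index used only to show the substitution.
primer = fill_index(P5_TRUNCATED_TEMPLATE, "ACGTTGCAAC")
print(primer)
```

The same helper could be applied to the P7 primer or the full-length adapter templates, since each contains one N run.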
  • This application provides a general solution for library construction.
  • Libraries built with this solution can be sequenced on both the Illumina and MGI platforms.
  • The design takes into account the balance of the 4 bases across the indexes of the smallest combination unit, and the strictly designed indexes guarantee an edit distance of 3 in both forward and reverse sequencing.
  • The library construction method includes: constructing a library from target samples using primers or adapters with a 5' phosphorylation modification to obtain a linear amplified library with a 5' phosphorylation modification, which is a linear library suitable for the Illumina sequencing platform; or further circularizing the linear amplified library with a 5' phosphorylation modification to obtain a circularized library suitable for the MGI sequencing platform;
  • wherein the 5' phosphorylation-modified primers include the P5 truncated amplification primer SEQ ID NO: 1 and the P7 truncated amplification primer SEQ ID NO: 2, and the 5' phosphorylation-modified adapters include the P5 full-length adapter SEQ ID NO: 3 and the P7 full-length adapter SEQ ID NO: 4;
  • 10 Ns represent the index sequence of P5 terminal;
  • A linear library constructed with 5' phosphorylation-modified primers or adapters can be sequenced directly on the Illumina platform; and because the linear library itself carries the 5' phosphate, it can also be circularized directly to prepare a library suitable for sequencing on the MGI platform. The method makes library construction simple and is compatible with both sequencing platforms.
  • The index sequence at the P5 end is selected from any one of Table 1-1, and the index sequence at the P7 end is selected from any one of Table 1-2. Because the index sequences in Table 1-1 and Table 1-2 fully account for possible errors or deletions during sequence synthesis and provide an edit distance of at least 3, the sequencing data of pooled samples can still be split correctly when a synthesized base is missing.
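One way to picture how such indexes keep pooled data separable is a lookup table of every index read that a single error can produce. The Python sketch below is an assumption-laden illustration, not the demultiplexing software of either platform: it tolerates one substitution or one deletion per index, models a deletion as the adapter base after the index shifting into the last read cycle, and uses invented sample names and 10-base indexes that are not from Table 1-1 or Table 1-2.

```python
def one_error_reads(index, downstream_base):
    """Fixed-length reads reachable from index by at most one substitution
    or one deletion (the adapter base then fills the last cycle)."""
    reads = {index}
    for pos in range(len(index)):
        for base in "ACGT":
            reads.add(index[:pos] + base + index[pos + 1:])
        reads.add(index[:pos] + index[pos + 1:] + downstream_base)
    return reads


def build_lookup(sample_indexes, downstream_base):
    """Map each tolerated index read to its sample; drop ambiguous reads."""
    lookup = {}
    ambiguous = set()
    for sample, index in sample_indexes.items():
        for read in one_error_reads(index, downstream_base):
            if read in lookup and lookup[read] != sample:
                ambiguous.add(read)  # reachable from two different samples
            lookup[read] = sample
    for read in ambiguous:
        del lookup[read]
    return lookup


# Hypothetical samples and indexes; "A" is the assumed base after the index.
samples = {"sampleA": "ACGTACGTAC", "sampleB": "TTGGCCAATT"}
lookup = build_lookup(samples, downstream_base="A")

print(lookup.get("ACGTACGTAC"))  # exact match
print(lookup.get("ACGTACCTAC"))  # one substitution
print(lookup.get("CGTACGTACA"))  # first base deleted, adapter base shifted in
print(lookup.get("GGGGGGGGGG"))  # unrelated read
```

Reads that can be reached from two different indexes are removed from the table, so an index set in which single errors never collide leaves the table free of such conflicts.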
  • When there are multiple target samples, the P5-end index sequences corresponding to them are selected from any set of 4-base balanced tag sequences in Table 1-1, and the P7-end index sequences corresponding to them are selected from any set of 4-base balanced tag sequences in Table 1-2.
  • A set of 4-base balanced tag sequences is a group of 4 tag sequences that are balanced, that is, at each of positions 1 to 10 of the tag sequence there is one each of the bases A, T, G and C.
  • A group of 4 tag sequences thus keeps the numbers of the 4 bases equal at every position; this balance ensures accurate base reading at the same position in each sequence and thereby improves the accuracy of the library.
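The balance condition for a group of four can be tested directly. The Python sketch below checks that every position of a group holds A, C, G and T exactly once. It also builds an example group by rotating the bases of a seed sequence (A to C, C to G, G to T, T to A); that construction is only a convenient way to obtain a balanced example and is not the design procedure of the application, which additionally enforces edit distance and flanking-base constraints. All names and the seed sequence are hypothetical.

```python
BASES = "ACGT"
ROTATE = str.maketrans("ACGT", "CGTA")  # A->C, C->G, G->T, T->A


def is_four_base_balanced(group):
    """True if group has 4 equal-length tags with one A, C, G, T per position."""
    if len(group) != 4 or len({len(tag) for tag in group}) != 1:
        return False
    # zip(*group) yields one column (4 bases) per tag position.
    return all(set(column) == set(BASES) for column in zip(*group))


def rotated_group(seed):
    """Illustrative construction: the seed plus its three base rotations."""
    group = [seed]
    for _ in range(3):
        group.append(group[-1].translate(ROTATE))
    return group


group = rotated_group("ACGTTGCAAC")  # hypothetical 10-base seed
print(group, is_four_base_balanced(group))
print(is_four_base_balanced(["ACGT", "ACGT", "GTAC", "TACG"]))
```

Because each rotation maps every base to a different one, the four tags never share a base at any position, which is exactly the property the checker verifies.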
  • Constructing a library from the target sample using primers with a 5' phosphorylation modification to obtain a linear amplified library with a 5' phosphorylation modification includes: ligating the truncated adapters shown in SEQ ID NO: 7 and SEQ ID NO: 8 to fragments derived from the target sample to obtain adapter-ligated fragments; and amplifying the adapter-ligated fragments with the 5' phosphorylation-modified primer pair shown in SEQ ID NO: 1 and SEQ ID NO: 2 to obtain a linear amplified library with a 5' phosphorylation modification; wherein SEQ ID NO: 7: ACACTCTTTCCCTACACGACGCTCTTCCGATC*T, where * represents a sulfur (phosphorothioate) modification; SEQ ID NO: 8: /5Phos/GATCGGAAGAGCACACGTCTGAACTCCAGTCAC.
  • Constructing a library from the target sample using 5' phosphorylation-modified adapters to obtain a linear amplified library with a 5' phosphorylation modification includes: ligating the full-length adapters shown in SEQ ID NO: 3 and SEQ ID NO: 4 to fragments from the target sample to obtain an adapter-ligated library; and amplifying the adapter-ligated library with the library amplification primers shown in SEQ ID NO: 5 and SEQ ID NO: 6 to obtain a 5' phosphorylation-modified linear amplified library; wherein SEQ ID NO: 5: /5Phos/AATGATACGGCGACCACCGAGAT; SEQ ID NO: 6: CAAGCAGAAGACGGCATACGA.
  • the construction steps of the linear library in the above two methods are also applicable to the construction of the capture library. That is, after the above steps, the capture library can be further obtained through targeted capture, and then the library amplification primers modified by 5' phosphorylation can be used to amplify the library to obtain a linear capture library, which is suitable for sequencing on the Illumina platform.
  • the capture library of the MGI platform it can be achieved by targeted capture of the linear amplified library before circularization.
  • the capture library after target capture is amplified by using 5' phosphorylation-modified library amplification primers to obtain a linearly amplified capture library, and the linearly amplified capture library is circularized to obtain A circular library suitable for the MGI sequencing platform; preferably, the 5' phosphorylation-modified library amplification primers include the P5 phosphorylation primer shown in SEQ ID NO: 5, and the P7 primer shown in SEQ ID NO: 6.
  • a library construction kit compatible with dual sequencing platforms
  • the library construction kit includes any one of the following combinations: 1) Combination 1: P5 truncated amplification primer SEQ ID NO: 1 and P7 truncated amplification primer SEQ ID NO: 2, wherein SEQ ID NO: 1: /5Phos/AATGATACGGCGACCACCGAGATCTACACNNNNNNNNNNACACTCTTTCCCTACACGAC, the 10 Ns represent the P5-end index sequence; SEQ ID NO: 2: CAAGCAGAAGACGGCATACGAGATNNNNNNNNNNGTGACTGGAGTTCAGACGTGT, the 10 Ns represent the P7-end index sequence; 2) Combination 2: P5 full-length adapter SEQ ID NO: 3 and P7 full-length adapter SEQ ID NO: 4, wherein SEQ ID NO: 3: /5Phos/AATGATACGGCGACCACCGAGATCTACACNNNNNNNNNNACACTCTTTCCCTACACGACGCTCTTCCGATC*
  • index sequence at the P5 terminal is selected from any one in Table 1-1
  • index sequence at the P7 terminal is selected from any one in Table 1-2.
  • the library construction kit includes 412 P5-terminal index sequences and 432 P7-terminal index sequences.
  • the P5-terminal index sequences are shown in Table 1-1, and the P7-terminal index sequences are shown in Table 1-2.
  • the index sequence and/or the P7-terminal index sequence are used in conjunction with a set of 4-base balanced tag sequences.
  • a library construction element compatible with dual sequencing platforms is provided, the library construction element being selected from any one of the following combinations: 1) Combination 1: P5 truncated amplification primer SEQ ID NO: 1 and P7 truncated amplification primer SEQ ID NO: 2, wherein SEQ ID NO: 1: /5Phos/AATGATACGGCGACCACCGAGATCTACACNNNNNNNNNNACACTCTTTCCCTACACGAC, the 10 Ns represent the P5-end index sequence; SEQ ID NO: 2: CAAGCAGAAGACGGCATACGAGATNNNNNNNNNNGTGACTGGAGTTCAGACGTGT, the 10 Ns represent the P7-end index sequence; 2) Combination 2: P5 full-length adapter SEQ ID NO: 3 and P7 full-length adapter SEQ ID NO: 4, wherein SEQ ID NO: 3: /5Phos/AATGATACGGCGACCACCGAGATCTACACNNNNNNNNNNACACTCTTTCCCTACACGACGCTCTTCCGA
  • index sequence at the P5 terminal is selected from any one in Table 1-1
  • index sequence at the P7 terminal is selected from any one in Table 1-2.
  • the library building element is an amplification primer composition or an adapter composition
  • the amplification primer composition includes a combination of multiple sets of P5 truncated amplification primers and/or multiple sets of P7 truncated amplification primers, each set of P5 truncated amplification primers containing any group of 4-base balanced tag sequences selected from Table 1-1
  • each set of P7 truncated amplification primers contains any set of 4-base balanced tag sequences selected from Table 1-2
  • the adapter composition includes multiple sets of P5 full-length adapters and/or multiple P7 full-length adapters, each group of P5 full-length adapters comprising any set of 4-base balanced tag sequences selected from Table 1-1,
  • Each set of P7 full-length adapters contains any set of 4-base balanced tag sequences selected from Table 1-2;
  • 4-base balanced tag sequences refer to a set of 4 tag sequences that are balanced, that is, in the tag sequence There is one each of the bases A, T, G,
  • Truncated linker sequence (the truncated linker sequence is consistent with the linker sequence of Illumina single-platform sequencing, both of which are the following sequences SEQ ID NO: 7 and SEQ ID NO: 8):
  • the truncated amplification primers are the above-mentioned SEQ ID NO: 1 and SEQ ID NO: 2, wherein the index sequence of SEQ ID NO: 1 is P5-001 to P5-096 in Table 1-1, and the index of SEQ ID NO: 2 The sequences are P7-001 to P7-096 in Table 1-2.
  • The features of Scheme 1 are:
  • the truncated P5 primer is phosphorylated so that it is compatible with the MGI sequencing platform.
  • the index in the middle is optimized to keep an edit distance of 3 while accounting for the base shift-in that a synthesis deletion can cause and for cross-platform compatibility.
  • Full-length adapter sequences SEQ ID NO: 3 + SEQ ID NO: 4, wherein the index sequences of SEQ ID NO: 3 are P5-001 to P5-096 in Table 1-1, and the index sequences of SEQ ID NO: 4 are P7-001 to P7-096 in Table 1-2.
  • P5 phosphorylation amplification primer is SEQ ID NO: 5:/5Phos/AATGATACGGCGACCACCGAGAT;
  • the P7 primer is SEQ ID NO: 6: CAAGCAGAAGACGGCATACGA.
  • the index in the middle is optimized to keep an edit distance of 3 while accounting for the base shift-in that a synthesis deletion can cause and for cross-platform compatibility.
  • the control scheme builds libraries with the product currently offered by Nanodigmbio for the Illumina platform together with IDT's 384 UDI adapters, and amplifies with ordinary P5 and P7 primers.
  • NadPrep TM DNA Library Construction Kit for Illumina
  • Figure 12 shows the output of the library construction sequenced on the Illumina platform of the first and second schemes of the present invention and the control.
  • with the same 50 ng input, Scheme 1 of the present invention needs only 6 cycles to reach the yield that Scheme 2 and the control reach in 7 cycles. In practical use, the truncated Scheme 1 has advantages in library yield and compatibility. Compatibility here refers to compatibility with universal truncated adapters, molecular-tag adapters for plasma applications, truncated methylated molecular-tag adapters, and amplicon library construction.
  • the advantage of Scheme 2 of the present invention is that it can be used for PCR-free library construction, and the yield of Scheme 2 is comparable to that of the control library construction method.
  • Example 2: With the scheme of the present invention, the same library can be sequenced on both platforms
  • the adapters follow Scheme 1 of the present invention, and library construction starts from 100 ng of a DNA standard (Promega) after ultrasonic fragmentation.
  • the first 96 species in Table 1-1 of the P5 end index number and Table 1-2 of the P7 end index number can be the combination of P5-001 and P7-001, the combination of P5-002 and P7-002, and so on, until the combination of P5-096 and P7-096.
  • the combination here is not the only limited combination, and any combination of four groups can be mixed and sequenced on the machine (for example, it can be P5-001 to P5-004 and P7-001 to P7 Any combination of P5 and P7 in -004, and any combination of P5 and P7 in the other four, such as P5-097 to P5-100 and P7-097 to P7-100, and so on).
  • the number of amplification cycles is 5 cycles, and the library output is shown in Figure 13. All library outputs are between 80% and 120% of the mean value, indicating that the amplification efficiency of the first scheme of the present invention is relatively balanced.
  • the libraries constructed with these 96 primer sets were pooled in equal proportion and subjected to whole-genome sequencing (WGS) on the Illumina and MGI sequencing platforms; the sequencing data were demultiplexed, and the demultiplexed data were normalized.
  • the data yield of each library was divided by the mean over all libraries. The final result is shown in Figure 14; the normalized values were between 75% and 125%, indicating that the scheme of the present invention performs consistently on both platforms.
  • a single compatible library construction scheme can therefore solve the problem of running on both platforms.
  • Example 3: Significance of four-index balance for run arrangement, and effective demultiplexing
  • the highest and lowest base proportions are shown for runs pooling 1 to 12 of the four-index-balanced sets of the present invention.
  • for the four-index balance of the present invention, the lowest value is 14.3%, which is above the 12.5% minimum that MGI requires for a run.
  • for the first 8 and the first 12 sets of the IDT version, the lowest values are 0 and 8.3%, respectively, as shown in Figure 11.
  • the final demultiplexing result is shown in Figure 16. Because the 8-set and 12-set pools of the present invention are balanced at every position, more than 97% of the data is demultiplexed in both cases. Demultiplexing of the 8-set and 12-set pools of the Illumina-recommended IDT version is unsatisfactory, at just over 30% and just over 80%, respectively. Because the HiSeq X Ten is a sequencer with four-color fluorescence channels, base imbalance seriously affects sequencing quality and effective demultiplexing.
  • the application designs and optimizes a compatible approach to library construction and post-hybrid-capture library construction; with this approach the library can be sequenced on the Illumina platform, or circularized directly and then sequenced on the MGI sequencing platform.
  • in designing the index, the first base upstream and downstream of the index is also fully considered, so that an edit distance of three is maintained in the presence of deletions and insertions and data are not misassigned during demultiplexing.
  • the strict four-base balance design avoids difficulties in arranging sequencing runs, which helps ensure sequencing quality and effective demultiplexing.

Abstract

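A library construction element, kit, and library construction method compatible with dual sequencing platforms. The library construction method comprises: constructing a library from a target sample using primers or adapters carrying a 5' phosphorylation modification to obtain a linear amplified library with a 5' phosphorylation modification, which is a linear library suitable for the Illumina sequencing platform; or further circularizing the linear amplified library with the 5' phosphorylation modification to obtain a circularized library suitable for the MGI sequencing platform. By placing a phosphorylation modification at the 5' end of the Illumina full-length adapter, or at the 5' end of the library amplification primer, a linear library suitable for the Illumina sequencing platform is obtained and, at the same time, a circularized library suitable for the MGI sequencing platform can be obtained simply by circularizing directly via the 5' phosphorylation modification, which solves the low compatibility of existing library construction methods across the two platforms.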

Description

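Library construction element, kit, and library construction method compatible with dual sequencing platforms
Technical Field
The present invention relates to the field of high-throughput sequencing library construction, and in particular to a library construction element, kit, and library construction method compatible with dual sequencing platforms.
Background
High-throughput sequencing obtains sequence information by determining nucleic acid sequences. The mainstream second-generation sequencers today are those of Illumina, MGI, and Life; Illumina holds the largest share, followed by MGI. These two sequencers share the same principle: both sequence by synthesis, reading the nucleic acid sequence by labeling the single nucleotides incorporated during synthesis with different fluorescent signals. Life's sequencers instead detect the electronic signal released during synthesis.
Because Illumina and MGI sequencing rely on fluorescently labeled single nucleotides, Illumina sequencers have, as the technology developed, adopted three fluorescence-channel and imaging modes: four-color four-channel, three-color two-channel, and four-color single-channel. MGI sequencers likewise have four-color four-channel and three-color two-channel modes. All of these modes must detect fluorescent signals, and because the signals overlap with one another, filters are needed during detection to minimize cross-talk. Another remedy is to keep the bases as balanced as possible when arranging pooled samples on a run; by the same reasoning, index sequences also need to be balanced. If balance is built into the index design, there is no need to agonize over run arrangement for each channel configuration.
Furthermore, Illumina loads libraries for sequencing in linear amplified form, whereas MGI loads them after circularization. To run an Illumina library on an MGI instrument, a separate processing step is therefore needed before circularization. The current approach is PCR amplification; although this allows Illumina libraries to be run on the MGI sequencing platform, it lengthens the workflow and increases the duplication rate of the sequencing data.
There is therefore a need in the market for a simple and effective library construction scheme that is compatible with both sequencing platforms.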
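Summary of the Invention
The main objective of the present invention is to provide a library construction element, kit, and library construction method compatible with dual sequencing platforms, so as to solve the low compatibility of prior-art library construction schemes across the two sequencing platforms.
To achieve the above objective, according to one aspect of the present invention, a library construction method compatible with dual sequencing platforms is provided. The method comprises: constructing a library from a target sample using primers or adapters with a 5' phosphorylation modification to obtain a linear amplified library with a 5' phosphorylation modification, the linear amplified library with the 5' phosphorylation modification being a linear library suitable for the Illumina sequencing platform; or further circularizing the linear amplified library with the 5' phosphorylation modification to obtain a circularized library suitable for the MGI sequencing platform; wherein the 5' phosphorylation-modified primers include the P5 truncated amplification primer SEQ ID NO: 1 and the P7 truncated amplification primer SEQ ID NO: 2; the 5' phosphorylation-modified adapters include the P5 full-length adapter SEQ ID NO: 3 and the P7 full-length adapter SEQ ID NO: 4; wherein SEQ ID NO: 1: /5Phos/AATGATACGGCGACCACCGAGATCTACACNNNNNNNNNNACACTCTTTCCCTACACGAC, the 10 Ns representing the P5-end index sequence; SEQ ID NO: 2: CAAGCAGAAGACGGCATACGAGATNNNNNNNNNNGTGACTGGAGTTCAGACGTGT, the 10 Ns representing the P7-end index sequence; SEQ ID NO: 3: /5Phos/AATGATACGGCGACCACCGAGATCTACACNNNNNNNNNNACACTCTTTCCCTACACGACGCTCTTCCGATC*T, the 10 Ns representing the P5-end index sequence and * representing a phosphorothioate modification; SEQ ID NO: 4: /5Phos/GATCGGAAGAGCACACGTCTGAACTCCAGTCACNNNNNNNNNNATCTCGTATGCCGTCTTCTGCTTG, the 10 Ns representing the P7-end index sequence; wherein the index sequences, including the P5-end index sequence or the P7-end index sequence, taken together with the 1 bp immediately upstream and the 1 bp immediately downstream of each, have an edit distance of at least three.
Further, the P5-end index sequence is selected from any one in Table 1-1, and the P7-end index sequence is selected from any one in Table 1-2.
Further, there are multiple target samples; the P5-end index sequences corresponding to the multiple target samples are selected from any group of 4-base balanced tag sequences in Table 1-1, and the P7-end index sequences corresponding to the multiple target samples are selected from any group of 4-base balanced tag sequences in Table 1-2. A 4-base balanced tag sequence means that a group of four tag sequences is balanced: at each of positions 1 to 10 of the tag sequences, each of the bases A, T, G, and C occurs once.
Further, constructing a library from the target sample using primers with a 5' phosphorylation modification to obtain a linear amplified library with a 5' phosphorylation modification comprises: ligating the truncated adapters shown in SEQ ID NO: 7 and SEQ ID NO: 8 to fragments derived from the target sample to obtain adapter-ligated fragments; and amplifying the adapter-ligated fragments with the 5' phosphorylation-modified primers shown in SEQ ID NO: 1 and SEQ ID NO: 2 to obtain the linear amplified library with the 5' phosphorylation modification; wherein SEQ ID NO: 7: ACACTCTTTCCCTACACGACGCTCTTCCGATC*T, * representing a phosphorothioate modification; SEQ ID NO: 8: /5Phos/GATCGGAAGAGCACACGTCTGAACTCCAGTCAC.
Further, constructing a library from the target sample using adapters with a 5' phosphorylation modification to obtain a linear amplified library with a 5' phosphorylation modification comprises: ligating the full-length adapters shown in SEQ ID NO: 3 and SEQ ID NO: 4 to fragments derived from the target sample to obtain an adapter-ligated library; and amplifying the adapter-ligated library with the library amplification primers shown in SEQ ID NO: 5 and SEQ ID NO: 6 to obtain the 5' phosphorylation-modified linear amplified library; wherein SEQ ID NO: 5: /5Phos/AATGATACGGCGACCACCGAGAT; SEQ ID NO: 6: CAAGCAGAAGACGGCATACGA.
Further, before circularization, the library construction method also comprises a step of targeted capture on the linear amplified library; preferably, the capture library obtained after targeted capture is amplified with 5' phosphorylation-modified library amplification primers to obtain a linear amplified capture library, and the linear amplified capture library is circularized to obtain a circularized library suitable for the MGI sequencing platform; preferably, the 5' phosphorylation-modified library amplification primers include the P5 phosphorylated primer shown in SEQ ID NO: 5 and the P7 primer shown in SEQ ID NO: 6.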
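According to a second aspect of this application, a library construction kit compatible with dual sequencing platforms is provided. The kit includes any one of the following combinations: 1) Combination 1: the P5 truncated amplification primer SEQ ID NO: 1 and the P7 truncated amplification primer SEQ ID NO: 2, wherein SEQ ID NO: 1: /5Phos/AATGATACGGCGACCACCGAGATCTACACNNNNNNNNNNACACTCTTTCCCTACACGAC, the 10 Ns representing the P5-end index sequence; SEQ ID NO: 2: CAAGCAGAAGACGGCATACGAGATNNNNNNNNNNGTGACTGGAGTTCAGACGTGT, the 10 Ns representing the P7-end index sequence; 2) Combination 2: the P5 full-length adapter SEQ ID NO: 3 and the P7 full-length adapter SEQ ID NO: 4, wherein SEQ ID NO: 3: /5Phos/AATGATACGGCGACCACCGAGATCTACACNNNNNNNNNNACACTCTTTCCCTACACGACGCTCTTCCGATC*T, the 10 Ns representing the P5-end index sequence and * representing a phosphorothioate modification; SEQ ID NO: 4: /5Phos/GATCGGAAGAGCACACGTCTGAACTCCAGTCACNNNNNNNNNNATCTCGTATGCCGTCTTCTGCTTG, the 10 Ns representing the P7-end index sequence; wherein the index sequences, including the P5-end index sequence or the P7-end index sequence, taken together with the 1 bp immediately upstream and the 1 bp immediately downstream of each, have an edit distance of at least three.
Further, the P5-end index sequence is selected from any one in Table 1-1, and the P7-end index sequence is selected from any one in Table 1-2.
Further, the library construction kit includes 412 P5-end index sequences and 432 P7-end index sequences; the P5-end index sequences are shown in Table 1-1 and the P7-end index sequences in Table 1-2, wherein the P5-end index sequences and/or the P7-end index sequences are used together in groups of 4-base balanced tag sequences.
Further, the library construction kit also includes the library amplification primers shown in SEQ ID NO: 5 and SEQ ID NO: 6, and/or the truncated adapters shown in SEQ ID NO: 7 and SEQ ID NO: 8.
According to a third aspect of this application, a library construction element compatible with dual sequencing platforms is provided. The element is selected from any one of the following combinations: 1) Combination 1: the P5 truncated amplification primer SEQ ID NO: 1 and the P7 truncated amplification primer SEQ ID NO: 2, wherein SEQ ID NO: 1: /5Phos/AATGATACGGCGACCACCGAGATCTACACNNNNNNNNNNACACTCTTTCCCTACACGAC, the 10 Ns representing the P5-end index sequence; SEQ ID NO: 2: CAAGCAGAAGACGGCATACGAGATNNNNNNNNNNGTGACTGGAGTTCAGACGTGT, the 10 Ns representing the P7-end index sequence; 2) Combination 2: the P5 full-length adapter SEQ ID NO: 3 and the P7 full-length adapter SEQ ID NO: 4, wherein SEQ ID NO: 3: /5Phos/AATGATACGGCGACCACCGAGATCTACACNNNNNNNNNNACACTCTTTCCCTACACGACGCTCTTCCGATC*T, the 10 Ns representing the P5-end index sequence and * representing a phosphorothioate modification; SEQ ID NO: 4: /5Phos/GATCGGAAGAGCACACGTCTGAACTCCAGTCACNNNNNNNNNNATCTCGTATGCCGTCTTCTGCTTG, the 10 Ns representing the P7-end index sequence; wherein the index sequences, including the P5-end index sequence or the P7-end index sequence, taken together with the 1 bp immediately upstream and the 1 bp immediately downstream of each, have an edit distance of at least three.
Further, the P5-end index sequence is selected from any one in Table 1-1, and the P7-end index sequence is selected from any one in Table 1-2.
Further, the library construction element is an amplification primer composition or an adapter composition. The amplification primer composition includes a combination of multiple sets of P5 truncated amplification primers and/or multiple sets of P7 truncated amplification primers, each set of P5 truncated amplification primers containing any group of 4-base balanced tag sequences selected from Table 1-1, and each set of P7 truncated amplification primers containing any group of 4-base balanced tag sequences selected from Table 1-2. The adapter composition includes multiple sets of P5 full-length adapters and/or multiple P7 full-length adapters, each set of P5 full-length adapters containing any group of 4-base balanced tag sequences selected from Table 1-1, and each set of P7 full-length adapters containing any group of 4-base balanced tag sequences selected from Table 1-2. A 4-base balanced tag sequence means that a group of four tag sequences is balanced: at each of positions 1 to 10 of the tag sequences, each of the bases A, T, G, and C occurs once.
By applying the technical solution of the present invention, a phosphorylation modification is placed at the 5' end of the Illumina full-length adapters (P5 and P7) or at the 5' end of the library amplification primers. A linear library suitable for the Illumina sequencing platform is thereby obtained and, if the MGI sequencing platform is needed, a circularized library suitable for it can be obtained simply by circularizing directly via the 5' phosphorylation modification. This solves the low compatibility of existing library construction methods across the two platforms.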
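Brief Description of the Drawings
The accompanying drawings, which form a part of this application, are provided for further understanding of the present invention. The illustrative embodiments of the invention and their descriptions serve to explain the invention and do not unduly limit it. In the drawings:
Figure 1 shows how an Illumina library can be run on the MGI sequencing platform after conversion;
Figure 2 shows the compatible library construction scheme in which a single library can be run on both platforms;
Figure 3 shows the compatible amplification scheme for amplification after targeted capture;
Figure 4 shows the proportion of base deletions in the index region;
Figure 5 shows the read-direction and terminal-base factors to consider when designing indexes compatible with both sequencing platforms;
Figure 6 shows that, when designing the index of the P5-end adapter, the flanking C and A bases must be considered in addition to the 10 index bases;
Figure 7 shows that, when designing the index of the P7-end adapter, the flanking T and G bases must be considered in addition to the 10 index bases;
Figure 8 shows that, for the P5-end adapter indexes in the Illumina patent, indexes with an edit distance of 3 drop to an edit distance of 2 once the terminal A is considered;
Figure 9 shows that many of the IDT 8-bp indexes currently used by Illumina likewise have an edit distance of only 2;
Figure 10 shows that two-color balance and four-base index balance behave very differently on different sequencers;
Figure 11 shows that the IDT product does not follow a balance rule;
Figure 12 compares the library yields of the two compatible schemes and the Illumina-only scheme;
Figure 13 shows the library yields of the 96 sets of library amplification primers of Scheme 1 of the present invention;
Figure 14 shows the demultiplexed sequencing data of the 96 dual-platform-compatible libraries on the two platforms;
Figure 15 shows the lowest and highest base proportions of the four-index balance for pools of 1 to 12 sets;
Figure 16 compares the data demultiplexing of the present invention and of the Illumina-recommended IDT version when pools of 8 and 12 sets are each run on a dedicated lane.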
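Detailed Description
It should be noted that, where no conflict arises, the embodiments of this application and the features of the embodiments may be combined with one another. The present invention is described in detail below with reference to the embodiments.
Dual-index adapter: in high-throughput sequencing, a universal sequencing adapter must be ligated to the end of each fragment. Each non-complementary arm of the adapter carries a variable region; this variable sequence is the tag (index) sequence and is used to demultiplex the data after sequencing.
4-base balanced tag sequences: DNA sequences consist of four bases, A, T, G, and C. For effective reading during sequencing, tag sequences are combined into a group such that the four bases occur in equal proportion at every position of the tag sequences.
Compatible with dual sequencing platforms: the library construction adapters or amplification primers are designed for running on both the Illumina sequencing platform and the MGI sequencing platform.
High-throughput sequencing (NGS) is an important and very widely used massively parallel sequencing technology. Current providers of massively parallel sequencing (MPS) technology include Illumina, MGI, and Ion Torrent, of which Illumina and MGI sequencers are the most widely used. Many companies and large research institutions are equipped with both sequencers, which share the same sequencing principle and sequencing quality. A common library construction scheme that can be run on both platforms without distinction would therefore remove much inconvenience for users.
The applicant has already developed single-index and dual-index library construction solutions for the MGI sequencing platform, as well as a library construction solution for Illumina. Current Illumina libraries can only be run on Illumina sequencers. Running them on an MGI sequencer requires conversion by MGI's App-a (currently one way for Illumina libraries to be run, namely the App-a conversion scheme, which adds terminal phosphorylation through 3-5 rounds of PCR amplification, as shown in Figure 1). In addition, because the index design does not consider base balance across four indexes, arranging sequencing runs is difficult. Moreover, for users who own both Illumina and MGI sequencers, maintaining two library construction schemes and two complete downstream hybrid capture schemes is cumbersome. Although MGI currently offers the conversion scheme shown in Figure 1, which performs a conversion amplification on a finished Illumina library, it has two main workflow problems: (1) it lengthens the workflow, since a finished library needs an extra conversion step, costing time, labor, and materials; (2) the extra amplification during conversion (3-5 cycles) artificially adds duplication to the sequencing data. The duplication rate of the MGI platform is inherently lower than that of the Illumina platform: on the MGISEQ-2000, whole-exome sequencing at 150× depth keeps the duplication rate at about 2%, whereas the Illumina NovaSeq 6000 has a platform-inherent duplication rate of 20%. The MGI conversion scheme thus forfeits the MGI platform's inherent advantage of low duplication.
In addition, as shown in Figure 10, both the MGI and Illumina platforms have two-color-channel and four-color-channel instruments, but neither the libraries developed for the Illumina platform nor the published patents targeting the Illumina platform pay attention to base balance for four-color channels, which is very detrimental to accurate reading of index sequences.
For this reason, this application takes both two-color and four-color channels fully into account and designs the indexes so that each group of four indexes is absolutely balanced across the four bases. The library construction adapters of this application thereby ensure index sequencing quality on both the Illumina and MGI platforms and make it easy for operators to arrange sequencing runs.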
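Scheme 1 of this application: libraries are built by ligating truncated adapters and then amplified with dual-index primers to add the index sequences at both ends. This scheme differs from previous Illumina library construction schemes in that the 5' end of the P5-end index primer is phosphorylated, which does not affect amplification while providing the phosphate later required for circularization when building libraries for the MGI sequencing platform, as shown in the left half of Figure 2. The resulting library can be run directly on the Illumina sequencing platform or circularized directly and run on the MGI sequencing platform.
Scheme 2 of the present invention: the library construction adapter is a full-length Y-shaped adapter whose P5 arm carries a 5' phosphorylation modification. The advantage of this design is that a PCR-free library can be sequenced directly on the Illumina platform or circularized directly and sequenced on the MGI platform. Libraries built with the full-length adapter can also be amplified with the phosphorylated P5 primer and the P7 primer and then sequenced directly on the Illumina platform, or circularized directly and sequenced on the MGI platform.
Building on the above library construction, the scheme of this application can be extended to targeted capture sequencing. After targeted capture, the library is amplified with P5 and P7 primers, of which only the P5-end primer is phosphorylated (sequencing is directional and MGI circularization circularizes only one strand, so only the strand with the 5'-phosphorylated P5 end is circularized), as shown in Figure 3. The post-capture amplified library can therefore be sequenced on Illumina or circularized directly and sequenced on the MGI platform.
It should be noted that P5/P7, or P5 end/P7 end, as used in this application refers to the P5 and P7 universal sequences of the Illumina sequencing platform. During actual sequencing, the applicant found base deletions attributable to the synthesis quality of the index portion, and inferred that bases are deleted with a certain probability and in a certain proportion during primer or adapter synthesis. As shown in Figure 4, analysis of sequencing data for the index portion showed that about 0.2%-2.8% of the desalted and HPLC-purified sequences synthesized by IDT carry a single-base deletion; HPLC purification reduces the deletions but cannot eliminate them. Moreover, when an index base is missing, the following base is shifted in and sequenced, so the first base after the index must also be considered when optimizing index sequences. As shown in Figure 5, because all read directions of Illumina and MGI are taken into account, the first base on both sides of the P5-end index and of the P7-end index must be considered.
As shown in Figure 6, the P5-end index sequence is read in the forward or the reverse direction depending on the Illumina instrument model, and in the reverse direction on MGI instruments. For the P5 end, therefore, both the C base in front of the index (the base after the index in a forward read) and the A base behind it (the base after the index in a reverse read) must be considered. Likewise, as shown in Figure 7, the P7-end index is read in the reverse direction on the Illumina platform, so the T base in front must be considered; the MGI platform reads the index in the forward direction, so the G base behind must be considered. The changes in the differences between indexes caused by deletion shift-in and sequencing errors therefore need to be considered, an issue that is not mentioned in previously published articles or published patent documents.
For example, in Illumina's patent PCT/US2018059255, some indexes that satisfy an edit distance of 3 drop to an edit distance of only 2, as shown in Figure 8. (In Figure 8, the edit distance is the number of positions at which two words of equal length differ; d(x, y) denotes the edit distance between two words x and y, obtained by XORing the two strings and counting the 1s in the result. The smaller the distance, the higher the similarity.) Considering only the single trailing A base at the P5 end, 12 of the 64 8-bp sequences have their edit distance reduced from 3 to 2. If the other direction is also considered, even more index sequences violate the 3-edit-distance rule. An index design must guarantee an edit distance of at least 3 so that, when one error is present, the remaining two differences still lead to the correct demultiplexing result. With an edit distance of only 2, a single error at any step makes correct demultiplexing impossible.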
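The distance d(x, y) described above, and the way a single-base deletion can erode it, can be sketched as follows. This is an illustrative sketch only: the two 8-base indexes and the helper names `hamming` and `deletion_reads` are hypothetical, and the deletion model (the sequencer still reads a fixed number of cycles, so the base following the index is read in the last cycle) is an assumption based on the shift-in described in the text, not the applicant's screening code.

```python
def hamming(x: str, y: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(x) != len(y):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(x, y))


def deletion_reads(index: str, next_base: str) -> set[str]:
    """All index reads obtained when exactly one base of `index` is missing.

    The read length stays len(index), so the base that follows the index
    on the oligo (`next_base`) is shifted into the last cycle.
    """
    return {index[:i] + index[i + 1:] + next_base for i in range(len(index))}


# Hypothetical 8-base indexes, not taken from any real index set.
idx_a = "GATCCGTA"
idx_b = "ATCCGTAC"

plain_distance = hamming(idx_a, idx_b)
# Closest approach of any single-deletion read of idx_a (followed by an A) to idx_b.
worst_case = min(hamming(read, idx_b) for read in deletion_reads(idx_a, "A"))

print(plain_distance)  # 7
print(worst_case)      # 1
```

In this toy pair the two indexes look far apart when compared position by position, yet losing the first base of `idx_a` produces a read only one mismatch away from `idx_b`. This is the kind of collision that a design has to rule out by checking the flanking bases together with the index.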
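Similarly, for the dual indexes published in the literature (e.g., PMID: 23793624), 7198 10-bp index sequences with an edit distance of 3 were computed. When the adapter bases immediately flanking the index are added to the 10-bp sequence, only a little over 1000 indexes retain a strict edit distance of 3. After further requiring groups of four and considering the upstream and downstream bases of the P5-end and P7-end indexes separately, 412 and 432 sequences respectively are suitable; see Table 1-1 and Table 1-2.
By the same reasoning, the number of 8-bp index sequences with an edit distance of 3 also shrinks; for example, the 384 IDT-designed 8-bp index sequences currently recommended officially for the Illumina platform have the same problem. As shown in Figure 9, a single-base mutation turns IDT P5-end index 4 (UDP0004) into the middle sequence, and if the first G base of index 3 (UDP0003) is deleted, the following A is shifted in during sequencing and the read likewise becomes the middle sequence. Designing indexes by considering one flanking base on each side together with base deletions or mutations has not been mentioned in previously published articles or in published patents, and the published index sequences do not have a strict edit distance of 3. Therefore, to be able to select enough sequences with a strict edit distance of 3, this application uses 10-bp dual-index sequences.
This application also found that, because both the Illumina and MGI platforms have 2-color-channel and 4-color-channel sequencers, ensuring index sequencing quality requires considering base balance among indexes in addition to the bases flanking each index. None of the index sequences in Illumina's products and patents consider this. As shown in Figure 10, in weighing whether two-color balance (two-channel) or absolute balance of four indexes (four-channel) better suits runs on both platforms, the present invention chose absolute balance of the four bases within each group of four indexes, so that sequencing data quality is good on both the 2-color and 4-color instruments of the Illumina and MGI platforms.
The filed patents and marketed products show that the Illumina platform has not considered this carefully, so index combinations must be chosen with care when arranging runs on Illumina instruments. Figure 11 shows the IDT P5-end indexes 1-12 recommended for the Illumina platform and the base statistics for pools of 4, 8, and 12 indexes: in the pool of 8 some bases are still absent at certain positions, and in the pool of 12 the lowest base proportion is 8.3% (for example, C at position 7 and A at position 8), far below the minimum of 12.5% required by MGI sequencers. By contrast, with this application's design of absolute base balance in groups of four indexes, the lowest base proportion is no lower than about 14.28% whenever four or more samples with consecutive indexes are pooled in equal amounts. The design therefore meets the requirements of all instrument models on both platforms, simplifies run arrangement, and improves and safeguards data quality.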
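One way to read the 14.28% figure is as a worst-case bound, sketched below under stated assumptions: samples are pooled in equal amounts, indexes are taken consecutively from balanced groups of four, every complete group contributes each base once per position, and the leftover indexes of an incomplete group may contribute none of a given base. The function name `worst_case_min_fraction` is hypothetical, and this arithmetic is an interpretation of the text rather than the applicant's calculation.

```python
from fractions import Fraction


def worst_case_min_fraction(n_samples: int, group_size: int = 4) -> Fraction:
    """Lower bound on the share of the rarest base at any index position.

    Only complete balanced groups are guaranteed to supply each base,
    exactly once per group and position.
    """
    if n_samples < group_size:
        raise ValueError("need at least one complete group")
    complete_groups = n_samples // group_size
    return Fraction(complete_groups, n_samples)


bounds = {n: worst_case_min_fraction(n) for n in range(4, 13)}
for n, bound in bounds.items():
    print(n, f"{float(bound):.2%}")

lowest = min(bounds.values())
print(lowest)  # 1/7
```

Under these assumptions the bound is tightest for a pool of 7 samples (one complete group plus three leftover indexes), giving 1/7, about 14.28%, which stays above the 12.5% minimum quoted for MGI instruments.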
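It should be noted that the core improvements of this application have the following features:
1. The truncated P5-end amplification primer with index carries a 5' phosphorylation modification (see SEQ ID NO: 1), the P5 sequence of the full-length adapter carries a 5' phosphorylation modification (see SEQ ID NO: 3), and the truncated P5-end primer without index carries a 5' phosphorylation modification (see SEQ ID NO: 5);
2. In designing the dual indexes, beyond the differences among the index sequences themselves, the reduction in edit distance caused by the shift-in of one flanking base on either side of the index, due to synthesis deletions and sequencing errors, was also considered, so the design meets the requirement of a strict edit distance of 3. The P5-end index is considered as the 10 index bases together with the flanking C and A bases (i.e., C-10-base index-A); the P7-end index is considered as the 10 index bases together with the flanking T and G bases (i.e., T-10-base index-G);
3. For convenient runs on all instrument models of both sequencing platforms, and building on the two points above, the present invention enforces strict balance at all 10 base positions within each group of four index sequences.
The present invention makes the following improvements over the Illumina design:
P5 truncated amplification primer SEQ ID NO: 1:
/5Phos/AATGATACGGCGACCACCGAGATCTACACNNNNNNNNNNACACTCTTTCCCTACACGAC, wherein the 10 Ns represent the index sequence, which may be any sequence in Table 1-1;
P7 truncated amplification primer SEQ ID NO: 2:
CAAGCAGAAGACGGCATACGAGATNNNNNNNNNNGTGACTGGAGTTCAGACGTGT, wherein the 10 Ns represent the index sequence, which may be any sequence in Table 1-2.
P5 full-length adapter SEQ ID NO: 3:
/5Phos/AATGATACGGCGACCACCGAGATCTACACNNNNNNNNNNACACTCTTTCCCTACACGACGCTCTTCCGATC*T, wherein * represents a phosphorothioate modification and the 10 Ns represent the index sequence, which may be any sequence in Table 1-1;
P7 full-length adapter SEQ ID NO: 4:
/5Phos/GATCGGAAGAGCACACGTCTGAACTCCAGTCACNNNNNNNNNNATCTCGTATGCCGTCTTCTGCTTG, wherein the 10 Ns represent the index sequence, which may be any sequence in Table 1-2.
P5 phosphorylated amplification primer SEQ ID NO: 5: /5Phos/AATGATACGGCGACCACCGAGAT, which differs from the original Illumina primer only by the phosphorylation;
The P7 amplification primer is identical to the original Illumina primer and is not described further here. The truncated adapter is likewise unchanged from the original Illumina scheme.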
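As a small illustration of how the templates above are meant to be filled in, the sketch below substitutes a 10-base index for the run of Ns in SEQ ID NO: 1 and SEQ ID NO: 2. The helper name `insert_index` and the example index are hypothetical (the example is not an entry of Table 1-1 or Table 1-2); `/5Phos/` is kept as a plain text prefix in the same notation the document uses.

```python
import re

# Templates copied from SEQ ID NO: 1 and SEQ ID NO: 2; the 5' phosphate is added separately.
P5_TRUNCATED = "AATGATACGGCGACCACCGAGATCTACACNNNNNNNNNNACACTCTTTCCCTACACGAC"
P7_TRUNCATED = "CAAGCAGAAGACGGCATACGAGATNNNNNNNNNNGTGACTGGAGTTCAGACGTGT"


def insert_index(template: str, index: str, phosphorylated: bool = False) -> str:
    """Replace the run of Ns in `template` with `index`."""
    placeholder = re.search(r"N+", template)
    if placeholder is None:
        raise ValueError("template has no N placeholder")
    if len(index) != placeholder.end() - placeholder.start():
        raise ValueError("index length does not match the placeholder")
    if set(index) - set("ACGT"):
        raise ValueError("index may contain only A, C, G and T")
    oligo = template[:placeholder.start()] + index + template[placeholder.end():]
    return "/5Phos/" + oligo if phosphorylated else oligo


example_index = "ACGTACGTAC"  # hypothetical index, for illustration only
p5_primer = insert_index(P5_TRUNCATED, example_index, phosphorylated=True)
p7_primer = insert_index(P7_TRUNCATED, example_index)
print(p5_primer)
print(p7_primer)
```

In the filled-in P5 primer the index sits between a C and an A, and in the P7 primer between a T and a G, which matches the C-index-A and T-index-G contexts that the design considers.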
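The following are the 10-bp 4-base balanced index sequences optimized on the basis of the Illumina platform adapter sequences:
After screening for a strict edit distance of 3 and for balance across four indexes, a total of 412 10-bp sequences were selected for the P5 end, shown in Table 1-1, and a total of 432 10-bp sequences for the P7 end, shown in Table 1-2.
Table 1-1. P5-end index sequences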
Figure PCTCN2021131508-appb-000001
Figure PCTCN2021131508-appb-000002
Figure PCTCN2021131508-appb-000003
Table 1-2. P7-end index sequences
Figure PCTCN2021131508-appb-000004
Figure PCTCN2021131508-appb-000005
Figure PCTCN2021131508-appb-000006
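In summary, this application provides a universal library construction solution that can be sequenced on all instrument models of both the Illumina and MGI platforms. The design accounts for 4-base balance of the indexes in the smallest pooling unit and guarantees a strict edit distance of 3 between indexes in both forward and reverse reads.
These improvements have two beneficial effects. First, the libraries are better suited to the various Illumina sequencer models, with a true edit distance of 3 between every pair of indexes. Second, the libraries can be circularized directly and sequenced on MGI sequencers.
Based on the above findings, the applicant proposes the technical solution claimed in this application. A library construction method compatible with dual sequencing platforms is provided. The method comprises: constructing a library from a target sample using primers or adapters with a 5' phosphorylation modification to obtain a linear amplified library with a 5' phosphorylation modification, the linear amplified library with the 5' phosphorylation modification being a linear library suitable for the Illumina sequencing platform; or further circularizing the linear amplified library with the 5' phosphorylation modification to obtain a circularized library suitable for the MGI sequencing platform; wherein the 5' phosphorylation-modified primers include the P5 truncated amplification primer SEQ ID NO: 1 and the P7 truncated amplification primer SEQ ID NO: 2; the 5' phosphorylation-modified adapters include the P5 full-length adapter SEQ ID NO: 3 and the P7 full-length adapter SEQ ID NO: 4; wherein SEQ ID NO: 1: /5Phos/AATGATACGGCGACCACCGAGATCTACACNNNNNNNNNNACACTCTTTCCCTACACGAC, the 10 Ns representing the P5-end index sequence; SEQ ID NO: 2: CAAGCAGAAGACGGCATACGAGATNNNNNNNNNNGTGACTGGAGTTCAGACGTGT, the 10 Ns representing the P7-end index sequence; SEQ ID NO: 3: /5Phos/AATGATACGGCGACCACCGAGATCTACACNNNNNNNNNNACACTCTTTCCCTACACGACGCTCTTCCGATC*T, * representing a phosphorothioate modification and the 10 Ns representing the P5-end index sequence; SEQ ID NO: 4: /5Phos/GATCGGAAGAGCACACGTCTGAACTCCAGTCACNNNNNNNNNNATCTCGTATGCCGTCTTCTGCTTG, the 10 Ns representing the P7-end index sequence; wherein the index sequences, including the P5-end index sequence or the P7-end index sequence, taken together with the 1 bp immediately upstream and the 1 bp immediately downstream of each, have an edit distance of at least three.
In the above improved scheme, the linear library built with 5' phosphorylation-modified primers or adapters can be run directly on the Illumina platform; and because the linear library itself carries a 5' phosphate, it can also be circularized directly into a library suitable for the MGI platform. The method is simple and compatible with both sequencing platforms.
To further improve base-call quality and the accuracy of demultiplexing pooled samples, in a preferred embodiment the P5-end index sequence is selected from any one in Table 1-1 and the P7-end index sequence is selected from any one in Table 1-2. Because the index sequences in Table 1-1 and Table 1-2 take possible synthesis errors and deletions fully into account and provide an edit distance of at least 3, the sequencing data of pooled samples can still be demultiplexed correctly even when a synthesized base is missing.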
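In another preferred embodiment, there are multiple target samples; the P5-end index sequences corresponding to the multiple target samples are selected from any group of 4-base balanced tag sequences in Table 1-1, and the P7-end index sequences corresponding to the multiple target samples are selected from any group of 4-base balanced tag sequences in Table 1-2. A 4-base balanced tag sequence means that a group of four tag sequences is balanced: at each of positions 1 to 10 of the tag sequences, each of the bases A, T, G, and C occurs once. When multiple target samples are pooled for sequencing, the accuracy of reading the index base at the same position in different samples (for example, position 3 in all of them) matters. Using the preferred groups of four tag sequences from Table 1-1 and Table 1-2 keeps the counts of the four base types equal and balanced, which ensures accurate base reading at the same position of each sequence and thereby raises the proportion of libraries that are demultiplexed correctly.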
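The balance condition defined above can be checked mechanically. The sketch below is one possible check, with a hypothetical function name `is_four_base_balanced` and a hypothetical group of four 10-base tags built by cyclically substituting bases; none of the sequences come from Table 1-1 or Table 1-2.

```python
def is_four_base_balanced(group: list[str]) -> bool:
    """True if the 4 tags have equal length and every position holds A, C, G, T once each."""
    if len(group) != 4:
        return False
    if len({len(tag) for tag in group}) != 1:
        return False
    # zip(*group) walks the tags column by column, i.e. position by position.
    return all(sorted(column) == ["A", "C", "G", "T"] for column in zip(*group))


# Hypothetical balanced group: each tag is the previous one with A->C, C->G, G->T, T->A.
balanced_group = ["ACGTTGCAAC", "CGTAATGCCG", "GTACCATGGT", "TACGGCATTA"]
# Same group with the last base of the fourth tag changed, which breaks position 10.
unbalanced_group = ["ACGTTGCAAC", "CGTAATGCCG", "GTACCATGGT", "TACGGCATTC"]

print(is_four_base_balanced(balanced_group))    # True
print(is_four_base_balanced(unbalanced_group))  # False
```

A check of this kind is applied per group of four, so any pool assembled from whole groups has every base at exactly 25% at every index position.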
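The specific workflow differs slightly depending on whether the 5' phosphorylation modification is carried by the truncated primers or by the full-length adapters. In a preferred embodiment, constructing a library from the target sample using primers with a 5' phosphorylation modification to obtain a linear amplified library with a 5' phosphorylation modification comprises: ligating the truncated adapters shown in SEQ ID NO: 7 and SEQ ID NO: 8 to fragments derived from the target sample to obtain adapter-ligated fragments; and amplifying the adapter-ligated fragments with the 5' phosphorylation-modified primers shown in SEQ ID NO: 1 and SEQ ID NO: 2 to obtain the linear amplified library with the 5' phosphorylation modification; wherein SEQ ID NO: 7: ACACTCTTTCCCTACACGACGCTCTTCCGATC*T, * representing a phosphorothioate modification; SEQ ID NO: 8: /5Phos/GATCGGAAGAGCACACGTCTGAACTCCAGTCAC.
In another preferred embodiment, constructing a library from the target sample using adapters with a 5' phosphorylation modification to obtain a linear amplified library with a 5' phosphorylation modification comprises: ligating the full-length adapters shown in SEQ ID NO: 3 and SEQ ID NO: 4 to fragments derived from the target sample to obtain an adapter-ligated library; and amplifying the adapter-ligated library with the library amplification primers shown in SEQ ID NO: 5 and SEQ ID NO: 6 to obtain the 5' phosphorylation-modified linear amplified library; wherein SEQ ID NO: 5: /5Phos/AATGATACGGCGACCACCGAGAT; SEQ ID NO: 6: CAAGCAGAAGACGGCATACGA.
The steps for constructing linear libraries in the two approaches above also apply to capture libraries. That is, after the above steps, a capture library can be obtained by targeted capture and then amplified with the 5' phosphorylation-modified library amplification primers to obtain a linear capture library suitable for sequencing on the Illumina platform.
For a capture library for the MGI platform, targeted capture is performed on the linear amplified library before circularization. In a preferred embodiment, the capture library obtained after targeted capture is amplified with 5' phosphorylation-modified library amplification primers to obtain a linear amplified capture library, which is circularized to obtain a circularized library suitable for the MGI sequencing platform; preferably, the 5' phosphorylation-modified library amplification primers include the P5 phosphorylated primer shown in SEQ ID NO: 5 and the P7 primer shown in SEQ ID NO: 6.
According to a second aspect of this application, a library construction kit compatible with dual sequencing platforms is provided. The kit includes any one of the following combinations: 1) Combination 1: the P5 truncated amplification primer SEQ ID NO: 1 and the P7 truncated amplification primer SEQ ID NO: 2, wherein SEQ ID NO: 1: /5Phos/AATGATACGGCGACCACCGAGATCTACACNNNNNNNNNNACACTCTTTCCCTACACGAC, the 10 Ns representing the P5-end index sequence; SEQ ID NO: 2: CAAGCAGAAGACGGCATACGAGATNNNNNNNNNNGTGACTGGAGTTCAGACGTGT, the 10 Ns representing the P7-end index sequence; 2) Combination 2: the P5 full-length adapter SEQ ID NO: 3 and the P7 full-length adapter SEQ ID NO: 4, wherein SEQ ID NO: 3: /5Phos/AATGATACGGCGACCACCGAGATCTACACNNNNNNNNNNACACTCTTTCCCTACACGACGCTCTTCCGATC*T, * representing a phosphorothioate modification and the 10 Ns representing the P5-end index sequence; SEQ ID NO: 4: /5Phos/GATCGGAAGAGCACACGTCTGAACTCCAGTCACNNNNNNNNNNATCTCGTATGCCGTCTTCTGCTTG, the 10 Ns representing the P7-end index sequence; wherein the index sequences, including the P5-end index sequence or the P7-end index sequence, taken together with the 1 bp immediately upstream and the 1 bp immediately downstream of each, have an edit distance of at least three.
Further, the P5-end index sequence is selected from any one in Table 1-1, and the P7-end index sequence is selected from any one in Table 1-2.
Further, the library construction kit includes 412 P5-end index sequences and 432 P7-end index sequences; the P5-end index sequences are shown in Table 1-1 and the P7-end index sequences in Table 1-2, wherein the P5-end index sequences and/or the P7-end index sequences are used together in groups of 4-base balanced tag sequences.
According to a third aspect of this application, a library construction element compatible with dual sequencing platforms is provided. The element is selected from any one of the following combinations: 1) Combination 1: the P5 truncated amplification primer SEQ ID NO: 1 and the P7 truncated amplification primer SEQ ID NO: 2, wherein SEQ ID NO: 1: /5Phos/AATGATACGGCGACCACCGAGATCTACACNNNNNNNNNNACACTCTTTCCCTACACGAC, the 10 Ns representing the P5-end index sequence; SEQ ID NO: 2: CAAGCAGAAGACGGCATACGAGATNNNNNNNNNNGTGACTGGAGTTCAGACGTGT, the 10 Ns representing the P7-end index sequence; 2) Combination 2: the P5 full-length adapter SEQ ID NO: 3 and the P7 full-length adapter SEQ ID NO: 4, wherein SEQ ID NO: 3: /5Phos/AATGATACGGCGACCACCGAGATCTACACNNNNNNNNNNACACTCTTTCCCTACACGACGCTCTTCCGATC*T, * representing a phosphorothioate modification and the 10 Ns representing the P5-end index sequence; SEQ ID NO: 4: /5Phos/GATCGGAAGAGCACACGTCTGAACTCCAGTCACNNNNNNNNNNATCTCGTATGCCGTCTTCTGCTTG, the 10 Ns representing the P7-end index sequence; wherein the index sequences, including the P5-end index sequence or the P7-end index sequence, taken together with the 1 bp immediately upstream and the 1 bp immediately downstream of each, have an edit distance of at least three.
Further, the P5-end index sequence is selected from any one in Table 1-1, and the P7-end index sequence is selected from any one in Table 1-2.
Further, the library construction element is an amplification primer composition or an adapter composition. The amplification primer composition includes a combination of multiple sets of P5 truncated amplification primers and/or multiple sets of P7 truncated amplification primers, each set of P5 truncated amplification primers containing any group of 4-base balanced tag sequences selected from Table 1-1, and each set of P7 truncated amplification primers containing any group of 4-base balanced tag sequences selected from Table 1-2. The adapter composition includes multiple sets of P5 full-length adapters and/or multiple P7 full-length adapters, each set of P5 full-length adapters containing any group of 4-base balanced tag sequences selected from Table 1-1, and each set of P7 full-length adapters containing any group of 4-base balanced tag sequences selected from Table 1-2. A 4-base balanced tag sequence means that a group of four tag sequences is balanced: at each of positions 1 to 10 of the tag sequences, each of the bases A, T, G, and C occurs once.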
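The beneficial effects of this application are further illustrated below with specific examples.
It should be noted that the following examples follow the library construction workflow provided in the NadPrep™ DNA Library Construction Kit (for Illumina) Instruction Manual V3.4 (Nanodigmbio (Nanjing) Biotechnology Co., Ltd.). The workflow is briefly as follows:
DNA sample fragmentation---end repair and A-tailing---adapter ligation---fragment size selection---PCR amplification---library purification, quantification, and quality control---sequencing on the Illumina/MGI platform, or sequencing after targeted capture.
It should also be noted that the following examples are illustrative only and do not limit the method of this application to the procedures below.
Example 1: Comparison of Scheme 1 and Scheme 2 of the present invention with prior-art library construction for the Illumina platform alone
Steps:
Library construction followed the instructions of the NadPrep™ DNA Library Construction Kit (for Illumina) (202105 Version 3.4); the only differences were the adapters and amplification primers, as follows:
(1) Scheme 1 of the present invention:
Truncated adapter sequences (identical to the adapter sequences used for Illumina single-platform sequencing, namely SEQ ID NO: 7 and SEQ ID NO: 8 below):
ACACTCTTTCCCTACACGACGCTCTTCCGATC*T, * represents a phosphorothioate modification (SEQ ID NO: 7)
/5Phos/GATCGGAAGAGCACACGTCTGAACTCCAGTCAC (SEQ ID NO: 8)
The truncated amplification primers are SEQ ID NO: 1 and SEQ ID NO: 2 above, wherein the index sequences of SEQ ID NO: 1 are P5-001 to P5-096 in Table 1-1 and the index sequences of SEQ ID NO: 2 are P7-001 to P7-096 in Table 1-2.
The features of Scheme 1 are:
1) The truncated P5 primer is phosphorylated so that it is compatible with the MGI sequencing platform.
2) The index in the middle is optimized to keep an edit distance of 3 while accounting for the base shift-in that a synthesis deletion can cause and for cross-platform compatibility.
3) Absolute balance at all 10 base positions of each group of 4 indexes, which facilitates run arrangement on two-color and four-color sequencing platforms.
(2) Scheme 2 of the present invention:
Full-length adapter sequences: SEQ ID NO: 3 + SEQ ID NO: 4, wherein the index sequences of SEQ ID NO: 3 are P5-001 to P5-096 in Table 1-1 and the index sequences of SEQ ID NO: 4 are P7-001 to P7-096 in Table 1-2.
The P5 phosphorylated amplification primer is SEQ ID NO: 5: /5Phos/AATGATACGGCGACCACCGAGAT;
The P7 primer is SEQ ID NO: 6: CAAGCAGAAGACGGCATACGA.
The features of Scheme 2 are:
1) The 5' end of the full-length P5 adapter is phosphorylated, and the P5-end primer used for subsequent amplification is 5' phosphorylated, so that the library is compatible with the MGI sequencing platform;
2) Full-length adapters allow PCR-free library construction;
3) The index in the middle is optimized to keep an edit distance of 3 while accounting for the base shift-in that a synthesis deletion can cause and for cross-platform compatibility.
4) Absolute balance at all 10 base positions of each group of 4 indexes, which facilitates run arrangement on two-color and four-color sequencing platforms.
(3) Control scheme
The control scheme builds libraries with the product currently offered by Nanodigmbio for the Illumina platform together with IDT's 384 UDI adapters, and amplifies with ordinary P5 and P7 primers; for the workflow, see the instructions of the NadPrep™ DNA Library Construction Kit (for Illumina) (202105 Version 3.4).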
Table 2: Comparison of yields of the three library construction schemes
Scheme      DNA input   Amplification cycles
Scheme 1    50 ng       6
Scheme 2    50 ng       7
Control     50 ng       7
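Figure 12 shows the library yields of Scheme 1, Scheme 2, and the control for libraries sequenced on the Illumina platform. With the same 50 ng input, Scheme 1 needs only 6 cycles to reach the yield that Scheme 2 and the control reach in 7 cycles. In practical use, the truncated Scheme 1 has advantages in library yield and compatibility. Compatibility here means compatibility with universal truncated adapters, molecular-tag adapters for plasma applications, truncated methylated molecular-tag adapters, and amplicon library construction. The advantage of Scheme 2 is that it allows PCR-free library construction; its yield is comparable to that of the control library construction method.
Example 2: With the scheme of the present invention, the same library can be sequenced on both platforms
Steps:
Library construction followed the instructions of the NadPrep™ DNA Library Construction Kit (for Illumina) (202105 Version 3.4), with adapters according to Scheme 1 of the present invention, starting from 100 ng of a DNA standard (Promega) after ultrasonic fragmentation. The truncated adapters of Scheme 1 were used, and amplification was performed with the SEQ ID NO: 1 and SEQ ID NO: 2 amplification primers, pairing the first 96 P5-end indexes of Table 1-1 with the first 96 P7-end indexes of Table 1-2 by number: for example, P5-001 with P7-001, P5-002 with P7-002, and so on up to P5-096 with P7-096. It should be noted, however, that this pairing is not the only permitted one; any combination formed in groups of four can be pooled and sequenced (for example, any P5/P7 combination within P5-001 to P5-004 and P7-001 to P7-004, together with any P5/P7 combination within another four, such as P5-097 to P5-100 and P7-097 to P7-100, and so on).
Five amplification cycles were used. Library yields are shown in Figure 13; all library yields fall between 80% and 120% of the mean, indicating that amplification efficiency in Scheme 1 is well balanced.
The libraries built with these 96 primer sets were also pooled in equal proportion and subjected to whole-genome sequencing (WGS) on the Illumina and MGI sequencing platforms. The sequencing data were demultiplexed and then normalized by dividing each library's data yield by the mean over all libraries. As shown in Figure 14, the normalized values fall between 75% and 125%, indicating that the scheme of the present invention performs consistently on both platforms. A single compatible library construction scheme can therefore solve the problem of running on both platforms.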
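The normalization described above (each library's yield divided by the mean over all libraries, then compared with a 75%-125% band) can be sketched as follows. The read counts, library names, and function names are hypothetical and are not data from Figure 14.

```python
from statistics import mean


def normalized_output(read_counts: dict[str, int]) -> dict[str, float]:
    """Divide each library's read count by the mean read count of the pool."""
    average = mean(read_counts.values())
    return {name: count / average for name, count in read_counts.items()}


def within_band(ratios: dict[str, float], low: float = 0.75, high: float = 1.25) -> bool:
    """True if every normalized value lies between `low` and `high` inclusive."""
    return all(low <= ratio <= high for ratio in ratios.values())


# Hypothetical demultiplexed read counts for four libraries of one pool.
counts = {"lib01": 980, "lib02": 1050, "lib03": 1120, "lib04": 850}

ratios = normalized_output(counts)
print(ratios)
print(within_band(ratios))  # True
```

Because the values are ratios to the pool mean, the same check can be applied to runs from either platform regardless of total output.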
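Example 3: Significance of four-index balance for run arrangement, and effective demultiplexing
Steps:
Library construction followed the instructions of the NadPrep™ DNA Library Construction Kit (for Illumina) (202105 Version 3.4). Libraries were built with 8 and 12 primer sets of the present invention and with 8 and 12 sets of the IDT version officially recommended for the Illumina platform, and each pool was run on a dedicated lane of a HiSeq X Ten to analyze demultiplexing.
Figure 15 shows the highest and lowest base proportions for runs pooling 1 to 12 of the four-index-balanced sets of the present invention. For pools of 4 or more, the lowest value is 14.3%, which is above the 12.5% minimum required for MGI runs. For the first 8 and the first 12 sets of the IDT version, the lowest values are 0 and 8.3%, respectively, as shown in Figure 11.
The final demultiplexing results are shown in Figure 16. Because the 8-set and 12-set pools of the present invention are balanced at every position, more than 97% of the data is demultiplexed in both cases. Demultiplexing of the 8-set and 12-set pools of the Illumina-recommended IDT version is unsatisfactory, at just over 30% and just over 80%, respectively. Because the HiSeq X Ten is a sequencer with four-color fluorescence channels, base imbalance seriously affects sequencing quality and effective demultiplexing.
From the above description it can be seen that the embodiments of the present invention achieve the following technical effects. This application designs and optimizes a compatible approach to library construction and to library construction after hybrid capture; with this approach the library can be sequenced on the Illumina platform, or circularized directly and sequenced on the MGI sequencing platform. In designing the indexes, the first base upstream and downstream of the index is also fully considered, so that an edit distance of three is maintained in the presence of deletions and insertions and data are not misassigned during demultiplexing. In addition, the strict four-base balance design avoids difficulties in arranging sequencing runs and helps ensure sequencing quality and effective demultiplexing.
The above are only preferred embodiments of the present invention and are not intended to limit it; those skilled in the art may make various modifications and changes to the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.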

Claims (13)

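  1. A library construction method compatible with dual sequencing platforms, characterized in that the library construction method comprises:
    constructing a library from a target sample using primers with a 5' phosphorylation modification or adapters with a 5' phosphorylation modification to obtain a linear amplified library with a 5' phosphorylation modification, the linear amplified library with the 5' phosphorylation modification being a linear library suitable for the Illumina sequencing platform; or
    further circularizing the linear amplified library with the 5' phosphorylation modification to obtain a circularized library suitable for the MGI sequencing platform;
    wherein the 5' phosphorylation-modified primers include the P5 truncated amplification primer SEQ ID NO: 1 and the P7 truncated amplification primer SEQ ID NO: 2; the 5' phosphorylation-modified adapters include the P5 full-length adapter SEQ ID NO: 3 and the P7 full-length adapter SEQ ID NO: 4;
    wherein SEQ ID NO: 1:
    /5Phos/AATGATACGGCGACCACCGAGATCTACACNNNNNNNNNNACACTCTTTCCCTACACGAC, the 10 Ns representing the P5-end index sequence;
    SEQ ID NO: 2:
    CAAGCAGAAGACGGCATACGAGATNNNNNNNNNNGTGACTGGAGTTCAGACGTGT, the 10 Ns representing the P7-end index sequence;
    SEQ ID NO: 3:
    /5Phos/AATGATACGGCGACCACCGAGATCTACACNNNNNNNNNNACACTCTTTCCCTACACGACGCTCTTCCGATC*T, the 10 Ns representing the P5-end index sequence and * representing a phosphorothioate modification;
    SEQ ID NO: 4:
    /5Phos/GATCGGAAGAGCACACGTCTGAACTCCAGTCACNNNNNNNNNNATCTCGTATGCCGTCTTCTGCTTG, the 10 Ns representing the P7-end index sequence;
    wherein the index sequences, including the P5-end index sequence or the P7-end index sequence, taken together with the 1 bp immediately upstream and the 1 bp immediately downstream of each, have an edit distance of at least three.
  2. The library construction method according to claim 1, characterized in that the P5-end index sequence is selected from any one in Table 1-1, and the P7-end index sequence is selected from any one in Table 1-2.
  3. The library construction method according to claim 2, characterized in that there are multiple target samples, the P5-end index sequences corresponding to the multiple target samples are selected from any group of 4-base balanced tag sequences in Table 1-1, and the P7-end index sequences corresponding to the multiple target samples are selected from any group of 4-base balanced tag sequences in Table 1-2, the 4-base balanced tag sequence meaning that a group of four tag sequences is balanced, that is, at each of positions 1 to 10 of the tag sequences, each of the bases A, T, G, and C occurs once.
  4. The library construction method according to claim 3, characterized in that constructing a library from the target sample using primers with a 5' phosphorylation modification to obtain a linear amplified library with a 5' phosphorylation modification comprises:
    ligating the truncated adapters shown in SEQ ID NO: 7 and SEQ ID NO: 8 to fragments derived from the target sample to obtain adapter-ligated fragments;
    amplifying the adapter-ligated fragments with the 5' phosphorylation-modified primers shown in SEQ ID NO: 1 and SEQ ID NO: 2 to obtain the linear amplified library with the 5' phosphorylation modification;
    wherein SEQ ID NO: 7: ACACTCTTTCCCTACACGACGCTCTTCCGATC*T, * representing a phosphorothioate modification;
    SEQ ID NO: 8: /5Phos/GATCGGAAGAGCACACGTCTGAACTCCAGTCAC.
  5. The library construction method according to claim 3, characterized in that constructing a library from the target sample using adapters with a 5' phosphorylation modification to obtain a linear amplified library with a 5' phosphorylation modification comprises:
    ligating the full-length adapters shown in SEQ ID NO: 3 and SEQ ID NO: 4 to fragments derived from the target sample to obtain an adapter-ligated library;
    amplifying the adapter-ligated library with the library amplification primers shown in SEQ ID NO: 5 and SEQ ID NO: 6 to obtain the 5' phosphorylation-modified linear amplified library;
    wherein SEQ ID NO: 5: /5Phos/AATGATACGGCGACCACCGAGAT;
    SEQ ID NO: 6: CAAGCAGAAGACGGCATACGA.
  6. The library construction method according to any one of claims 1 to 5, characterized in that, before the circularization, the library construction method further comprises a step of targeted capture on the linear amplified library;
    preferably, the capture library obtained after targeted capture is amplified with 5' phosphorylation-modified library amplification primers to obtain a linear amplified capture library,
    and the linear amplified capture library is subjected to the circularization to obtain the circularized library suitable for the MGI sequencing platform;
    preferably, the 5' phosphorylation-modified library amplification primers include the P5 phosphorylated primer shown in SEQ ID NO: 5 and the P7 primer shown in SEQ ID NO: 6.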
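  7. A library construction kit compatible with dual sequencing platforms, characterized in that the library construction kit comprises any one of the following combinations:
    1) Combination 1: a P5 truncated amplification primer, SEQ ID NO: 1, and a P7 truncated amplification primer, SEQ ID NO: 2, wherein SEQ ID NO: 1 is:
    /5Phos/AATGATACGGCGACCACCGAGATCTACACNNNNNNNNNNACACTCTTTCCCTACACGAC, where the 10 Ns represent the P5-end index sequence;
    SEQ ID NO: 2 is:
    CAAGCAGAAGACGGCATACGAGATNNNNNNNNNNGTGACTGGAGTTCAGACGTGT, where the 10 Ns represent the P7-end index sequence;
    2) Combination 2: a P5 full-length adapter, SEQ ID NO: 3, and a P7 full-length adapter, SEQ ID NO: 4, wherein
    SEQ ID NO: 3 is:
    /5Phos/AATGATACGGCGACCACCGAGATCTACACNNNNNNNNNNACACTCTTTCCCTACACGACGCTCTTCCGATC*T, where the 10 Ns represent the P5-end index sequence and * represents a phosphorothioate modification;
    SEQ ID NO: 4 is:
    /5Phos/GATCGGAAGAGCACACGTCTGAACTCCAGTCACNNNNNNNNNNATCTCGTATGCCGTCTTCTGCTTG, where the 10 Ns represent the P7-end index sequence;
    wherein the sequences made up of the P5-end index sequence or the P7-end index sequence together with 1 bp upstream and 1 bp downstream of the index sequence have an edit distance of at least three.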
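  8. The library construction kit according to claim 7, characterized in that the P5-end index sequence is selected from any one of the sequences in Table 1-1, and the P7-end index sequence is selected from any one of the sequences in Table 1-2.
  9. The library construction kit according to claim 7, characterized in that the library construction kit comprises 412 P5-end index sequences and 432 P7-end index sequences, the P5-end index sequences being as shown in Table 1-1 and the P7-end index sequences being as shown in Table 1-2,
    wherein the P5-end index sequences and/or the P7-end index sequences are each used together in groups of 4-base-balanced tag sequences.
  10. The library construction kit according to any one of claims 7 to 9, characterized in that the library construction kit further comprises the library amplification primers shown in SEQ ID NO: 5 and SEQ ID NO: 6, and/or the truncated adapters shown in SEQ ID NO: 7 and SEQ ID NO: 8.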
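  11. A library construction element compatible with dual sequencing platforms, characterized in that the library construction element is selected from any one of the following combinations:
    1) Combination 1: a P5 truncated amplification primer, SEQ ID NO: 1, and a P7 truncated amplification primer, SEQ ID NO: 2, wherein SEQ ID NO: 1 is:
    /5Phos/AATGATACGGCGACCACCGAGATCTACACNNNNNNNNNNACACTCTTTCCCTACACGAC, where the 10 Ns represent the P5-end index sequence;
    SEQ ID NO: 2 is:
    CAAGCAGAAGACGGCATACGAGATNNNNNNNNNNGTGACTGGAGTTCAGACGTGT, where the 10 Ns represent the P7-end index sequence;
    2) Combination 2: a P5 full-length adapter, SEQ ID NO: 3, and a P7 full-length adapter, SEQ ID NO: 4, wherein
    SEQ ID NO: 3 is:
    /5Phos/AATGATACGGCGACCACCGAGATCTACACNNNNNNNNNNACACTCTTTCCCTACACGACGCTCTTCCGATC*T, where the 10 Ns represent the P5-end index sequence and * represents a phosphorothioate modification;
    SEQ ID NO: 4 is:
    /5Phos/GATCGGAAGAGCACACGTCTGAACTCCAGTCACNNNNNNNNNNATCTCGTATGCCGTCTTCTGCTTG, where the 10 Ns represent the P7-end index sequence;
    wherein the sequences made up of the P5-end index sequence or the P7-end index sequence together with 1 bp upstream and 1 bp downstream of the index sequence have an edit distance of at least three.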
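  12. The library construction element according to claim 11, characterized in that the P5-end index sequence is selected from any one of the sequences in Table 1-1, and the P7-end index sequence is selected from any one of the sequences in Table 1-2.
  13. The library construction element according to claim 11, characterized in that the library construction element is an amplification primer composition or an adapter composition,
    the amplification primer composition comprises a combination of multiple groups of P5 truncated amplification primers and/or multiple groups of P7 truncated amplification primers, each group of P5 truncated amplification primers containing any one group of 4-base-balanced tag sequences selected from Table 1-1, and each group of P7 truncated amplification primers containing any one group of 4-base-balanced tag sequences selected from Table 1-2;
    the adapter composition comprises multiple groups of P5 full-length adapters and/or multiple groups of P7 full-length adapters, each group of P5 full-length adapters containing any one group of 4-base-balanced tag sequences selected from Table 1-1, and each group of P7 full-length adapters containing any one group of 4-base-balanced tag sequences selected from Table 1-2;
    the 4-base-balanced tag sequences mean that a group of four tag sequences is balanced, i.e., at each position from position 1 to position 10 of the tag sequences, each of the bases A, T, G and C occurs exactly once.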
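
For illustration only, the two index constraints recited in the claims (4-base balance within a group of four indexes, and an edit distance of at least three over the index plus 1 bp on each side) can be checked with a short Python sketch. The primer templates are SEQ ID NO: 1 and SEQ ID NO: 2 as written in claim 1, without the /5Phos/ modification. Everything else is an assumption: the function names are hypothetical, `TOY_GROUP` is a made-up group that is not taken from Table 1-1 or Table 1-2, the edit distance is taken to be the Levenshtein distance, and the constraint is read as applying to every pair of flanked indexes.

```python
from itertools import combinations

# SEQ ID NO: 1 and SEQ ID NO: 2 from claim 1 (the /5Phos/ modification is omitted).
P5_PRIMER = "AATGATACGGCGACCACCGAGATCTACACNNNNNNNNNNACACTCTTTCCCTACACGAC"
P7_PRIMER = "CAAGCAGAAGACGGCATACGAGATNNNNNNNNNNGTGACTGGAGTTCAGACGTGT"
PLACEHOLDER = "N" * 10

# Made-up example group, NOT taken from Table 1-1 or Table 1-2.
TOY_GROUP = ["AACCGGTTAC", "CCAATTGGCA", "GGTTAACCGT", "TTGGCCAATG"]


def fill_index(template, index):
    """Replace the run of 10 Ns in a primer or adapter template with an index."""
    if len(index) != 10 or set(index) - set("ACGT"):
        raise ValueError("index must be 10 bases drawn from A, C, G, T")
    if template.count(PLACEHOLDER) != 1:
        raise ValueError("template must contain exactly one run of 10 Ns")
    return template.replace(PLACEHOLDER, index)


def flanked(template, index):
    """Return the index plus 1 bp upstream and 1 bp downstream in the template."""
    start = template.index(PLACEHOLDER)
    return fill_index(template, index)[start - 1:start + 11]


def levenshtein(a, b):
    """Edit distance counting substitutions, insertions and deletions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # match or substitution
        prev = cur
    return prev[-1]


def is_base_balanced(group):
    """True if four 10-base indexes show A, C, G and T once each at every position."""
    if len(group) != 4 or any(len(s) != 10 for s in group):
        return False
    return all(sorted(col) == ["A", "C", "G", "T"] for col in zip(*group))


def min_flanked_distance(template, indexes):
    """Smallest pairwise edit distance over the flanked (12-base) index sequences."""
    seqs = [flanked(template, idx) for idx in indexes]
    return min(levenshtein(a, b) for a, b in combinations(seqs, 2))


if __name__ == "__main__":
    print(fill_index(P5_PRIMER, TOY_GROUP[0]))
    print("balanced:", is_base_balanced(TOY_GROUP))
    print("min flanked distance:", min_flanked_distance(P5_PRIMER, TOY_GROUP))
```

Because the flanking bases come from the fixed primer or adapter backbone, they are identical for every index on the same end, so under this reading the check mainly guards against indexes that become confusable through a shift of the read frame by one base.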
PCT/CN2021/131508 2021-11-09 2021-11-18 Library construction element, kit and library construction method compatible with dual sequencing platforms Ceased WO2023082305A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP21951111.0A EP4202058A4 (en) 2021-11-09 2021-11-18 BANK BUILDING ELEMENT COMPATIBLE WITH DOUBLE SEQUENCING PLATFORMS, BANK BUILDING KIT AND METHOD
US18/016,857 US20240279647A1 (en) 2021-11-09 2021-11-18 Element, Kit and Library Construction Method Compatible With Double Sequencing Platforms

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111322946.6A CN113999893B (zh) 2021-11-09 2021-11-09 Library construction element, kit and library construction method compatible with dual sequencing platforms
CN202111322946.6 2021-11-09

Publications (1)

Publication Number Publication Date
WO2023082305A1 (zh) 2023-05-19

Family

ID=79928434

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/131508 Ceased WO2023082305A1 (zh) 2021-11-09 2021-11-18 Library construction element, kit and library construction method compatible with dual sequencing platforms

Country Status (4)

Country Link
US (1) US20240279647A1 (zh)
EP (1) EP4202058A4 (zh)
CN (1) CN113999893B (zh)
WO (1) WO2023082305A1 (zh)

Also Published As

Publication number Publication date
EP4202058A4 (en) 2024-05-01
EP4202058A1 (en) 2023-06-28
CN113999893A (zh) 2022-02-01
CN113999893B (zh) 2022-11-01
US20240279647A1 (en) 2024-08-22

Legal Events

Date Code Title Description
WWE WIPO information: entry into national phase (Ref document number: 18016857; Country of ref document: US)
ENP Entry into the national phase (Ref document number: 2021951111; Country of ref document: EP; Effective date: 20230131)
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 21951111; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)