Posts Tagged ‘analysis’

Genomic insights into familial adenomatous polyposis: unraveling a rare case with whole APC gene deletion and … – Nature.com

Familial adenomatous polyposis (FAP) is an autosomal dominant disorder resulting from germline mutations in the APC gene. The APC gene, comprising 15 exons and encoding a protein with 2843 amino acids, is implicated in ~80% of FAP cases [1]. Extensive genetic analysis has revealed germline variants in FAP patients, and most APC mutations are found in the 5′ half of the coding region. Genotype-phenotype correlations have been reported for small-nucleotide alterations, including frameshift and nonsense mutations [2,3]. Large genomic deletions and duplications have been identified using multiplex ligation-dependent probe amplification (MLPA) [4]. Whole-genome array comparative genomic hybridization (aCGH) was used to identify a large deletion involving the middle portion of the long arm of chromosome 5 [5]. Here, we report a case of an FAP patient with intellectual disability that was attributed to a large deletion involving 5q22.2.

The proband was a 28-year-old female who was referred to the emergency hospital with acute abdominal pain. Computed tomography (CT) demonstrated perforation of the descending colon, multiple colorectal polyps, multiple liver metastases and lymph node swelling. She underwent left hemicolectomy, and the subsequent histological diagnosis was moderately differentiated adenocarcinoma (pT4a, pStage IVa). Chemotherapy was selected for treatment of the residual metastasis. Colonoscopy revealed advanced colon cancer with multiple adenomatous polyps (>100). Head CT revealed an osteoma in her skull, and the phenotype was subsequently defined as Gardner's syndrome.

The patient had slight intellectual disability without developmental delay or neurogenic abnormalities. She and her mother requested comprehensive genomic panel (CGP) analysis (OncoGuide™ NCC oncopanel, Sysmex, Hyogo, Japan) of surgically resected colon cancer tissue after providing informed consent. This test can detect mutations in 124 genes and differentiate between germline and somatic mutations. The pathogenic mutations detected were KRAS G13D, PIK3CA H1047R, and TP53 M169fs*2, but no targeted therapy was recommended by the expert panel. No germline findings were reported, but whole APC gene deletion was suspected due to the low amplicon depth of the APC gene in both the tumor tissue and blood samples (Fig. S1).

According to her familial history (Fig. 1), her mother (II-3) was treated for sporadic colon cancer. She refused genetic testing due to receiving cancer chemotherapy. Her son (IV-1), whose intelligence was slightly low, had a single-parent history because his father was not identified.

The arrow indicates the patients who underwent genetic counseling. A closed circle indicates an individual with colorectal cancer. Colorectal polyposis was observed in the proband (III-1) but not in her ancestors.

After genetic counseling, aCGH (GenetiSure Dx Postnatal Assay, Agilent, Tokyo, Japan) was performed for further genetic testing. Notably, aCGH revealed the loss of chromosome 5 (chr5) q22.1-q22.2 (Fig. 2), the loss of chr3 p24.1-p23, and the gain of chr15 q15.3. The chr5 deletion included the entire APC gene (chr5:112043195-112181936 in GRCh37) located at 5q22.2 (Fig. S2), according to the Database of Chromosomal Imbalance and Phenotype in Humans Using Ensembl Resources (DECIPHER, https://www.deciphergenomics.org).

A heterozygous 5q22 deletion was detected. The minimal and maximal deletion positions in GRCh37 (start_stop) were 111143360_112213143 and 111118900_112239978, respectively.
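As a quick consistency check on the coordinates quoted above, the following minimal sketch (not part of the original report) computes the sizes of the minimal and maximal deleted intervals and tests whether the APC gene interval given in the text (chr5:112043195-112181936, GRCh37) lies entirely inside the minimal deletion. The helper names are illustrative assumptions.

```python
# Illustrative check of the reported GRCh37 coordinates on chr5.
# Function and variable names are hypothetical, not from the paper.

APC = (112043195, 112181936)            # APC gene interval
MIN_DELETION = (111143360, 112213143)   # minimal deleted interval
MAX_DELETION = (111118900, 112239978)   # maximal deleted interval


def span_bp(interval):
    """Length of an interval in base pairs (end minus start)."""
    start, end = interval
    return end - start


def contains(outer, inner):
    """True if `inner` lies completely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


min_size_mb = span_bp(MIN_DELETION) / 1e6
max_size_mb = span_bp(MAX_DELETION) / 1e6
apc_fully_deleted = contains(MIN_DELETION, APC)

print(f"minimal deletion: {min_size_mb:.2f} Mb")
print(f"maximal deletion: {max_size_mb:.2f} Mb")
print(f"APC inside minimal deletion: {apc_fully_deleted}")
```

Under these coordinates the deleted segment is a little over 1 Mb and covers the whole gene even in the minimal interval, which is consistent with the approximately 1 Mb heterozygous loss described in the text.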

This case, in which deletion of the entire APC gene was determined by aCGH, is rare. The chromosome 5q22.1-22.2 deletion causes a heterozygous loss of about 1 Mb that includes the APC gene, which was reported as a cytogenetically detected deletion in previous reports. Previously, karyotyping and fluorescence in situ hybridization were used to detect large submicroscopic genomic deletions, and aCGH was used to detect high-resolution copy number variants in whole chromosomes [6]. aCGH is sensitive and comprehensive, allowing detection of multiple variations, and annotations by specialists are needed. DECIPHER catalogs common copy number changes, enabling the identification of potentially pathogenic variants. aCGH can also be used for sequencing targeted genes. For FAP patients, germline APC variants are identified by direct sequencing using next-generation sequencing (NGS) and MLPA [5]. Sequencing has been used to detect APC gene variants, but ~20% of FAP patients do not carry these variants. MLPA is useful for detecting whole or large APC gene copy number variants in mutation-negative FAP patients. There are several case reports in which germline variants of FAP were examined via aCGH [7,8,9,10].

Our young patient with advanced colon cancer derived from multiple colorectal polyposis was diagnosed with FAP according to the clinical features. A CGP was performed using NGS for cancer precision medicine in this patient. Because metastatic colon cancer is treated by chemotherapy, somatic genomic analysis with CGP was also conducted to determine the optimal chemotherapy regimen. Next, we used NGS to determine the sequence of 100 bp amplicons of 124 cancer-related genes from cancer tissue and peripheral blood. A large APC deletion was not detected by this targeted sequencing, although both the somatic and germline amplicon depths of the APC gene were slightly low. A large number of APC variants have already been deposited in the ClinVar database (https://www.ncbi.nlm.nih.gov/clinvar/). For several FAP patients in whom germline APC variants were not found, investigations of copy number variations have been performed. The genotype-phenotype correlation of patients with chromosome 5q deletions has been discussed [10]. A classical FAP phenotype is associated with a mutation in codons 168-1250 or codons 1400-1580. A severe phenotype is caused by a mutation in codons 1250-1464. A more attenuated form is associated with mutations in three regions: the 5′ region of the APC gene, the alternative splicing region in exon 9, and the extreme 3′ end of the gene [11].

Whole or partial APC gene deletions can be detected with recently developed genetic techniques [9,10,12]. MLPA and aCGH are candidates for confirming large deletions or duplications, and the latter genetic test was chosen for our patient. In our patient, two chromosomal losses and one gain were detected. The advantage of chromosomal analysis is that it can reveal unexpected genetic changes even in separate chromosomes. The CGH database includes some patients with large deletions in chromosomal region 5q22, including the APC gene. In a very recent case report, aCGH was utilized to identify a large 19.85 Mb deletion [12]. A case series with a literature review described a patient with intellectual disability and a colon neoplasm with an interstitial deletion of 5q identified by aCGH. Colorectal cancers are observed in some patients with 5q deletions, yet examination of colorectal polyposis in this context is limited. Among the primary dysmorphisms and symptoms linked to 5q deletions, the predominant manifestation identified in the analysis of 12 patients was mental retardation [12]. The cases documented in both the literature and the DECIPHER database are characterized by common clinical features, including predisposition to cancer, intellectual disability, and neurodevelopmental delay. Patients with these congenital changes should undergo genetic testing, including G-band, fluorescence in situ hybridization (FISH), and aCGH. aCGH offers high resolution, allowing for the detection of changes at the chromosomal level. This high sensitivity is particularly valuable when conventional methods, such as karyotyping or FISH, may not provide detailed information about genomic alterations. Moreover, this approach allows researchers and clinicians to explore potential genetic factors beyond the well-known APC gene. In the near future, long-read sequencing of large deletions may enable us to obtain detailed genomic information [13].
Additional clinical information is needed to establish the genotype-phenotype correlations associated with the 5q22.2 deletion that includes the whole APC gene. The published cases have raised the question of whether whole APC deletion induces colorectal polyposis. Casper et al. reported a case of Gardner syndrome attributable to a substantial interstitial deletion of chromosome 5q, offering a comprehensive review of published cases [9]. Until 2014, 16 patients with FAP resulting from chromosome 5q deletions were documented, with all but one patient presenting with classic adenomatous polyposis rather than the profuse form. Most of these deletions were de novo alterations, consistent with our reported case in which the patient's mother (II-3) exhibited sporadic colon cancer without polyposis. In the familial lineage (Fig. 1), our patient's son (IV-1) carried a deletion in the 5q22.1-22.2 region, mirroring the genomic alteration of his mother (III-1). However, the genetic inheritance pattern of this large deletion is unclear. Meticulous follow-up of the young boy is important for addressing this issue.

In conclusion, this study describes a rare FAP patient characterized by a large deletion of chromosome 5q22.1-22.2 identified through comprehensive genomic analysis. The genetic variant was suspected by CGP and eventually identified by aCGH. These findings emphasize the importance of advanced genetic techniques in identifying complex genomic variations and suggest a need for additional research to elucidate the specific features associated with whole-APC gene deletions.


Genetic variation passed down through generations may influence cancer development – Baylor College of Medicine | BCM

Genes affected by germline structural variation could conceivably influence cancer risk.

Researchers at Baylor College of Medicine's Dan L Duncan Comprehensive Cancer Center and Human Genome Sequencing Center investigated the extent to which forms of genetic variation called germline or inherited structural variation (SV) influence gene expression in human cancers.

“Structural variation is one type of genomic variation and can be beneficial, neutral or, if it affects functionally relevant regions of the genome, can seriously affect gene function and contribute to disease, including cancer,” said corresponding author Dr. Chad Creighton, professor of medicine and co-director of cancer bioinformatics at the Dan L Duncan Comprehensive Cancer Center at Baylor.

Structural variations are larger differences in the genome that occur when a piece of DNA is duplicated, deleted, or switched around, which can impact genetic instructions encoded in DNA and affect the expression of nearby genes. Previous studies led by the researchers have shown that structural variations occurring in specific cell types, like breast cells, can strongly influence gene expression in ways that contribute to transforming a healthy breast cell into a cancer cell.

“It's known that germline structural variation also can contribute to the molecular profile of cancers,” Creighton said. “Here we study the extent of its contribution.” The study is published in Cell Reports Medicine.

The researchers worked with data developed by the Pan-Cancer Analysis of Whole Genomes consortium, which includes whole genome sequencing data from 2,658 cancers across 38 tumor types involving 20 major tissues of origin. The team integrated these data with RNA data to identify genes whose expression was associated with nearby germline structural variations.

“We found most of the genes associated with germline structural variations would not necessarily have specific roles in cancer, but for some genes, the expression variation might be associated with other conditions,” Creighton said.

At the same time, several genes affected by germline structural variation could conceivably contribute to cancer, for instance if these genes have an established cancer association or an association with patient survival.

This study shows that germline structural variation would represent a normal class of genetic variation passed down through generations and may play a significant role in cancer development. The researchers propose that the subset of genes with cancer-relevant associations arising in this study would represent strong candidates for further investigation on their value in genetic testing.

Fengju Chen, Yiqun Zhang and Fritz J. Sedlazeck also contributed to this work.

This study was supported by the National Institutes of Health grant P30CA125123.

By Ana María Rodríguez, Ph.D.



Microplastics dampen the self-renewal of hematopoietic stem cells by disrupting the gut microbiota-hypoxanthine-Wnt … – Nature.com

Mice

C57BL/6J (CD45.2) and C57BL6.SJL (CD45.1) mice were purchased from The Jackson Laboratory and housed under specific pathogen-free conditions. Male and female mice aged 8 to 12 weeks were used in experiments and provided with a suitable environment and sufficient water and food. After a week of acclimatization, the mice were randomly divided into groups and given 100 μL pure water, 0.01 mg/100 μL, or 0.1 mg/100 μL MPs by oral gavage every two days for five weeks in the gavage experiment (n = 5 for each group). For the intravenous injection experiment, MPs were administered into mouse blood via the tail vein at a dose of 0.1 μg/100 μL per week for 4 weeks (n = 5 for each group). All animal experiments were approved in advance by the Laboratory Animal Welfare and Ethics Committee of Zhejiang University (AP CODE: ZJU20220108).

Indocyanine green polystyrene (ICG-PS), polystyrene (PS) and polymethyl methacrylate (PMMA) particles were obtained from Suzhou Mylife Advanced Material Technology Company (China). Polyethylene (PE) particles were purchased from Cospheric (USA). Scanning electron microscopy (SEM, Nova Nano 450, FEI) was used to characterize the primary sizes and shapes of different MPs [20]. MPs were dispersed in ultrapure water with sonication before dynamic light scattering analysis (Zetasizer, Malvern, UK) to determine the hydrodynamic sizes and zeta potentials [49].

Mice were sacrificed and organs, including the heart, lung, kidney, spleen, liver, gastrointestinal tissues and bone marrow, were removed within six hours of ICG-PS gavage. Feces were collected 1 h before the mice were sacrificed. Both organs and feces were monitored by ex vivo bioluminescence imaging with a small-animal imaging system [50] (IVIS Spectrum, PerkinElmer).

For flow cytometry analysis and isolation of hematopoietic stem and progenitor cells (HSPCs), cells were stained with relevant antibodies [51] in PBS with 2% fetal bovine serum (FBS) for 30-45 min on ice. The following antibodies were used: Sca-1-PE-Cy7, c-Kit-APC, CD150-PE, CD48-BV421, CD45.1-FITC, CD45.2-PE-Cy5, Gr-1-PE-Cy5, Mac1-PE-Cy5, IgM-PE-Cy5, CD3-PE-Cy5, CD4-PE-Cy5, CD8-PE-Cy5, CD45R-PE-Cy5 and Ter-119-PE-Cy5. Detailed antibody information is summarized in Supplementary Table S6. HSPCs were stained with a lineage antibody cocktail (Gr-1, Mac1, CD3, CD4, CD8, CD45R, TER119 and B220), Sca-1, c-Kit, CD150 and CD48. Cell types were defined as follows: LSK compartment (Lin−Sca-1+c-Kit+), LT-HSC (LSK CD150+CD48−), ST-HSC (LSK CD150−CD48−), MPP2 (LSK CD150+CD48+), MPP3/4 (LSK CD150−CD48+), B cells (CD45.2+Mac1Gr-1+B220+), T cells (CD45.2+Mac1Gr-1+CD3+) and myeloid cells (CD45.2+Mac1+Gr-1). Samples were analyzed on a flow cytometer (CytoFLEX LX, Beckman). For sorting HSCs, lineage antibody cocktail-conjugated paramagnetic microbeads and MACS separation columns (Miltenyi Biotec) were used to enrich Lin− cells before sorting. Stained cells were resuspended in PBS with 2% FBS and sorted directly using the Beckman MoFlo Astrios EQ (Beckman). Flow cytometry data were analyzed with FlowJo (BD) software.

Apoptosis of cells was detected by Annexin V staining (Yeason, China). After extraction from mouse bone marrow, 5 × 10^6 cells were labeled with different surface markers for 30 to 45 min at 4 °C and then rinsed twice with PBS. Subsequently, the cells were resuspended in binding buffer and supplemented with Annexin V. After 30 min of incubation, the signal was detected by flow cytometry in the FITC channel. Cell cycle analysis was performed with the fluorescein Ki-67 set (BD Pharmingen, USA), following the directions provided by the manufacturer. Briefly, a total of 5 × 10^6 bone marrow cells were labeled with the corresponding antibodies, as stated above. Afterward, the cells were pre-treated with a fixation/permeabilization concentrate (Invitrogen, USA) at 4 °C overnight and subsequently rinsed with the binding buffer. The cells were stained with Ki-67 antibody for 1 h in the dark and then with DAPI (Invitrogen) for another 5 min at room temperature. Flow cytometry data were collected on a flow cytometer (CytoFLEX LX, Beckman, USA).

HSCs were sorted by flow cytometry according to the experimental group (ctrl and PSH mice, Rikenellaceae treatment or hypoxanthine treatment). 150 HSCs were seeded in triplicate on methylcellulose media [52] (M3434, Stemcell Technologies, Inc.). After 8 days, the number of colonies was counted by microscopy. In addition, 5000 BM cells were seeded and analyzed in the same way as HSCs. The cell culture media was diluted in PBS and centrifuged at 400 × g for 5 min to determine the total cell number.

Recipient mice (CD45.1) were administered drinking water with Baytril (250 mg/L) for 7 days pre-transplant and 10 days post-transplant. The day before transplantation, recipients received a lethal dose of radiation (two doses of 4.5 Gy, 4 h apart). In primary transplantation, 2 × 10^5 bone marrow cells from the ctrl or PS group (CD45.2) mice and 2 × 10^5 recipient-type (CD45.1) bone marrow cells were transplanted into recipient (CD45.1) mice. Cells were injected into recipients via tail vein injection. Donor chimerism was tracked using peripheral blood cells every 4 weeks for at least 16 weeks after transplantation. For secondary transplantation, donor BM cells were collected from primary transplant recipients sacrificed at 16 weeks after transplantation and transplanted at a dosage of 2 × 10^6 cells into irradiated secondary recipient mice (9 Gy). Analysis of donor chimerism and the cycle of transplantation in secondary transplantation were the same as in primary transplantation.

For limiting dilution assays [52], 1 × 10^4, 5 × 10^4 and 2 × 10^5 donor-derived bone marrow cells were collected from ctrl or PS mice (CD45.2) and transplanted into irradiated (9 Gy) CD45.1 recipient mice with 2 × 10^5 recipient-type (CD45.1) bone marrow cells. Limiting dilution analysis was performed using ELDA software [53]. At 16 weeks after transplantation, recipient mice with more than 1% peripheral-blood multilineage chimerism were defined as positive engraftment. Recipient mice that died before 16 weeks post transplantation were likewise evaluated as having failed engraftment [54].
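To illustrate the kind of calculation behind a limiting dilution analysis, here is a minimal sketch of a maximum-likelihood estimate under the single-hit Poisson model, in which a recipient fails to engraft with probability exp(-f × dose) for a stem cell frequency f. This is not the ELDA software itself, which fits a related statistical model and also reports confidence intervals; the engraftment counts below are hypothetical and only the three cell doses come from the text.

```python
import math

# Hypothetical outcomes; only the doses match the text:
# (cells transplanted, recipients tested, recipients positive).
DATA = [(1e4, 5, 1), (5e4, 5, 3), (2e5, 5, 5)]


def score(freq, data):
    """Derivative of the single-hit Poisson log-likelihood with respect to freq."""
    total = 0.0
    for dose, tested, positive in data:
        negative = tested - positive
        p_neg = math.exp(-freq * dose)      # P(graft contains no stem cell)
        total -= negative * dose
        if positive:
            total += positive * dose * p_neg / (1.0 - p_neg)
    return total


def estimate_frequency(data, lo=1e-9, hi=1e-2, iterations=200):
    """Find the frequency where the score is zero by bisection on a log scale.

    Assumes the data contain at least one positive and one negative recipient,
    so the score changes sign between `lo` and `hi`.
    """
    for _ in range(iterations):
        mid = math.sqrt(lo * hi)
        if score(mid, data) > 0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


freq = estimate_frequency(DATA)
print(f"estimated frequency: 1 in {1 / freq:,.0f} cells")
```

The score function decreases monotonically in f, which is why a simple bisection is enough here; a production analysis would add interval estimates and goodness-of-fit tests.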

For histological analysis, small intestines were collected, fixed in 4% paraformaldehyde, embedded in paraffin, sectioned (5 μm thickness), and stained with H&E at the ZJU Animal Histopathology Core Facility (China). We used Chiu's scores [33,34] to evaluate the damage for each sample. The grades were as follows: 0, normal mucosa; 1, development of subepithelial Gruenhagen's space at the tip of the villus; 2, extension of the Gruenhagen's area with moderate epithelial lifting; 3, large epithelial bulge with a few denuded villi; 4, denuded villi with lamina propria and exposed capillaries; and 5, disintegration of the lamina propria, ulceration, and hemorrhage. For TEM analysis, slices of the small intestine were fixed with 2.5% glutaraldehyde for ultra-microstructure observation of intestinal epithelial cells. The samples were postfixed for one hour at 4 °C with 1% osmium tetroxide and for 30 min with 2% uranyl acetate, followed by dehydration with a graded series of alcohol solutions (50%, 70%, 90% and 100% for 15 min each) and acetone (100% twice for 20 min). Subsequently, they were embedded in epon (Sigma-Aldrich, MO, US) and polymerized. Ultrathin sections (60-80 nm) were made and examined using TEM (Tecnai G2 Spirit 120 kV, Thermo FEI).

In the short-term and long-term mouse models for MP ingestion, mice were fasted for 4 h before oral gavage of FITC-dextran (4 kD, Sigma). The fluorescence intensity of FITC-dextran (50 mg/100 g body weight) was measured in the peripheral blood 2 h after gavage. Fluorescence was measured using a microplate reader (Molecular Devices, SpectraMax iD5) with excitation at 490 nm and emission at 520 nm [29].

Fecal samples (about 30-50 mg per sample) were collected from the ctrl, PSL and PSH mice, quickly frozen in liquid nitrogen, and stored at −80 °C. DNA samples for the microbial community were extracted using the E.Z.N.A. Stool DNA Kit (Omega, USA), according to the manufacturer's instructions. In brief, polymerase chain reaction (PCR) amplification of the prokaryotic 16S rDNA gene V3-V4 region was performed using the forward primer 341F (5′-CCTACGGGNGGCWGCAG-3′) and the reverse primer 805R (5′-GACTACHVGGGTATCTAATCC-3′) [55]. After 35 cycles of PCR, sequencing adapters and barcodes were included to facilitate amplification. The PCR products were detected by 1.5% agarose gel electrophoresis and were further purified using AMPure XT beads (Beckman Coulter Genomics, Danvers, MA, USA), while the target fragments were recovered using the AxyPrep PCR Cleanup Kit (Axygen, USA). In addition, the amplicon library was quantified with the Library Quantification Kit for Illumina (Kapa Biosciences, Woburn, MA, USA), and sequenced on the Illumina NovaSeq PE250 platform. In the bioinformatics pipeline [29,56], paired-end reads were assigned to samples by their unique barcodes and then shortened by cutting off the barcode and primer sequences. The paired-end reads were merged with FLASH (v1.2.8). Quality filtering of the raw reads was carried out under precise parameters with fqtrim (v0.94) to obtain high-quality clean tags. The chimeric sequences were filtered with Vsearch software (v2.3.4). After the dereplication process using DADA2, we acquired a feature table and feature sequences. The bacterial sequence fragments obtained were grouped into Operational Taxonomic Units (OTUs) and compared to the Greengenes microbial gene database using QIIME2. Alpha diversity and beta diversity were generated by QIIME2, and figures were drawn in R (v3.2.0). The species annotation sequence alignment was performed by Blast, with the SILVA and NT-16S databases as the alignment references.
Additional sequencing results are provided in Supplementary Table S1. The experiment was supported by Lc-Bio Technologies.
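The primers quoted above contain degenerate IUPAC bases (N, W, H, V), so primer removal has to match a family of sequences rather than one literal string. The following minimal sketch shows one way to do that with a regular expression; it is an illustration only and does not reproduce what fqtrim, QIIME2 or any other tool named in the text does internally. The example read is made up.

```python
import re

# IUPAC nucleotide codes needed for the primers quoted in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "[AT]", "M": "[AC]", "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

PRIMER_341F = "CCTACGGGNGGCWGCAG"
PRIMER_805R = "GACTACHVGGGTATCTAATCC"


def primer_regex(primer):
    """Compile a degenerate primer into a regex anchored at the read start."""
    return re.compile("^" + "".join(IUPAC[base] for base in primer))


def trim_primer(read, primer):
    """Return the read without its leading primer, or None if it does not match."""
    match = primer_regex(primer).match(read)
    return read[match.end():] if match else None


# Hypothetical read: one concrete instance of 341F followed by made-up sequence.
read = "CCTACGGGAGGCAGCAG" + "TGGGGAATATTG"
print(trim_primer(read, PRIMER_341F))   # TGGGGAATATTG
```

Real pipelines usually also tolerate a small number of mismatches and handle reverse-complemented primers, which this sketch leaves out for brevity.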

The methods for the analysis of feces from HSCT donors were slightly different from those used for mice. All samples were stored in the GUHE Flora Storage buffer (GUHE Laboratories, China). The bacterial genomic DNA was extracted with the GHFDE100 DNA isolation kit (GUHE Laboratories, China) and quantified using a NanoDrop ND-1000 spectrophotometer (Thermo Fisher Scientific, USA). The V4 region of the bacterial 16S rDNA genes was amplified by PCR, with the forward primer 515F (5′-GTGCCAGCMGCCGCGGTAA-3′) and the reverse primer 806R (5′-GGACTACHVGGGTWTCTAAT-3′). PCR amplicons were purified with Agencourt AMPure XP Beads (Beckman Coulter, IN) and quantified with the PicoGreen dsDNA Assay Kit (Invitrogen, USA). Following the previously reported steps [57], paired-end 2 × 150 bp sequencing was performed on the Illumina NovaSeq6000 platform. The details of bacterial OTUs are summarized in Supplementary Table S5. Sequence data analyses were performed using QIIME2 and R packages (v3.2.0).

For metabolite evaluation, samples from mouse feces were prepared and analyzed as previously described [55,58,59]. Briefly, metabolites were extracted from feces with precooled 50% methanol buffer and stored at −80 °C before LC-MS analysis. All chromatographic separations were conducted using an ultra-performance liquid chromatography (UPLC) system (SCIEX, UK). A reversed-phase separation was performed using an ACQUITY UPLC T3 column (100 mm × 2.1 mm, 1.8 μm, Waters, UK). The temperature of the column oven was maintained at 35 °C and the flow rate was 0.4 mL/min. Both positive (ionspray voltage floating set at 5000 V) and negative (−4500 V) ion modes were analyzed using a TripleTOF 5600 Plus high-resolution tandem mass spectrometer (SCIEX, UK). The mass spectrometry data were obtained in information-dependent acquisition (IDA) mode, with a time-of-flight (TOF) mass range of 60 to 1200 Da. The survey scans were acquired in 150 milliseconds, and up to 12 product ion scans with a charge state of 1+ and a threshold of 100 counts per second (counts/s) were recorded. Cycle duration was 0.56 s. Stringent quality assurance (QA) and quality control (QC) procedures were applied: the mass accuracy was calibrated every 20 samples and a QC sample was acquired every 10 samples. LC-MS raw data files were processed in XCMS (Scripps, La Jolla, CA) to perform peak picking, peak alignment, gap filling, and sample normalization. The online KEGG database was used to annotate metabolites by matching the precise molecular mass data (m/z) of samples with those from the database. PCA and volcano plots were used to identify ion features that differed significantly between the groups. The details of metabolomes can be found in Supplementary Table S2. The experiment was supported by Lc-Bio Technologies.

Before FMT, SPF mice received a 200 μL antibiotic treatment (1 g/L ampicillin, 0.5 g/L neomycin, 0.5 g/L vancomycin and 1 g/L metronidazole) for three consecutive days by oral gavage. Fresh feces were collected from ctrl or PS mice and resuspended in reduced PBS (0.5 g/L cysteine and 0.2 g/L Na2S in PBS) at a ratio of about 120 mg feces/mL reduced PBS. Feces were then centrifuged at 500 × g for 1 min to remove insoluble particles [25]. Recipients (C57BL/6J mice) were administered 100 mL of the supernatant from different groups by oral gavage twice every week for 4 weeks. Two days after the last FMT, recipients were euthanized to analyze the changes in the hematopoietic system.

The Rikenellaceae strain (ATCC BAA-1961), purchased from ATCC, was cultured in an anaerobic chamber using BD Difco Dehydrated Culture Media: Reinforced Clostridial Medium at a temperature of 37 °C with a gas mixture of 80% N2 and 20% CO2. The final concentration of Rikenellaceae was 2 × 10^8 viable c.f.u. per 100 μL, and hypoxanthine (200 mg/kg, Sigma, Germany) was dissolved in double distilled water [29]. Mice first received antibiotic treatment (same as FMT) and were then treated by oral gavage with 100 μL of either Rikenellaceae or hypoxanthine suspension three times a week for 4 weeks. Reinforced Clostridial Medium or double distilled water was used as a vehicle control, respectively. Two days after the last administration, recipients were euthanized to analyze the changes in the hematopoietic system. To examine the impact of hypoxanthine on HSCs, we exposed bone marrow cells to direct co-culture with hypoxanthine at a concentration of 100 pg/mL for a period of 3 days.

Mouse bone marrow cells were harvested by flushing the mice's tibia and femur in phosphate buffered saline (PBS) with 2% fetal bovine serum (GIBCO). Harvested cells were grown in 96-well U-bottom plates containing freshly made HSC culture medium (StemSpan™ SFEM, Stemcell Tec.) with SCF (50 ng/mL; PeproTech) and TPO (50 ng/mL; PeproTech), at 37 °C with 5% CO2. For HSC culture, the medium was changed every 3 days by manually removing half of the conditioned medium and replacing it with fresh medium [60]. To assess the effects of WNT10A, IL-17, TNF and NF-kappa B on hematopoiesis, we cultured HSCs in a basic medium supplemented with related proteins (10 ng/mL; Cosmo Bio, USA) or PBS as a control for two days, followed by flow cytometry analysis. Different concentrations of PS were added to the medium and tested using CCK-8 and FACS to detect the effect of MPs on cultured HSCs.

In triplicate, 1 × 10^4 HSCs were obtained from mouse bone marrow cells from the ctrl or PSH group by flow cytometry sorting, and RNA was extracted with RNAiso Plus (Takara, Japan) according to the manufacturer's protocol. The concentration and integrity of RNA were examined by Qubit 2.0 and Agilent 2100 (Novogene, China), respectively. Oligo (dT)-coated magnetic beads (Novogene, China) were used to enrich eukaryotic mRNA. After cDNA synthesis and PCR amplification, the PCR product was purified using AMPure XP beads (Novogene, China) to obtain the final library. The Illumina high-throughput sequencing platform NovaSeq 6000 was used for sequencing. Gene expression analysis was performed in R with the DESeq2 package [61]. Detailed information regarding RNA-seq is listed in Supplementary Table S3.

For RNA expression analysis, total RNA from bone marrow cells was extracted using Trizol (Invitrogen, US) and resuspended in nuclease-free water. Reverse transcription was performed using the QuantiTect Reverse Transcription kit (Qiagen NV). qPCR was conducted using cDNA, primers and SYBR-green (Takara, Japan) in 20 μL using the ABI 7500 Q-PCR system [62]. Results were calculated using the RQ value (RQ = 2^(−ΔΔCt)). Mouse Actin was chosen as the normalization control. Gene-specific primer sequences are shown in Supplementary Table S7.
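The RQ value in qPCR is conventionally computed with the 2^(−ΔΔCt) method: the target gene's Ct is first normalized to the reference gene within each sample, and the treated sample is then compared with the control. A minimal sketch, assuming that is the formula meant here; the function name and the Ct values are hypothetical.

```python
def relative_quantity(ct_target_treated, ct_ref_treated,
                      ct_target_control, ct_ref_control):
    """Relative quantity by the 2^(-ddCt) method."""
    delta_treated = ct_target_treated - ct_ref_treated   # dCt in treated sample
    delta_control = ct_target_control - ct_ref_control   # dCt in control sample
    delta_delta = delta_treated - delta_control          # ddCt
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values, not taken from the paper.
rq = relative_quantity(26.0, 18.0, 24.0, 18.0)
print(rq)   # 0.25
```

In this made-up example the target needs two more cycles in the treated sample after normalization, which corresponds to a four-fold lower expression (RQ = 0.25).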

Bone marrow and Rikenellaceae supernatant in different groups were obtained by centrifugation. Fecal supernatant was obtained from human samples. Hypoxanthine (LANSO, China) and WNT10A (EIAab, China) were measured by ELISA with respective kits according to the manufacturers' protocols.

Human feces and peripheral blood samples were obtained from 14 subjects who provided grafts for HSCT patients. They were divided into a graft success (GS) group and a graft failure/poor graft function (GF/PGF) group, with 7 participants in each group. Research involving humans was approved by the Clinical Research Ethics Committee of the First Affiliated Hospital, College of Medicine, Zhejiang University (IIT20230067B). All participants read and signed the informed consent. Detailed information on patients is listed in Supplementary Table S4.

The Agilent 8700 Laser Direct Infrared (LDIR) Imaging system was utilized for fast and automated analysis of MPs in feces received from donors. An excess of concentrated nitric acid (68%) was added to the sample and heated to dissolve the protein. Large particles were first intercepted with a large aperture filter and then filtered by vacuum extraction. After rinsing with ultra-pure water and ethanol several times, the materials, including MPs, were dispersed in the ethanol solution. The LDIR test was carried out when the ethanol was completely volatilized [63]. The sample of MPs was positioned on the standard sample stage. The stage was then put into the sample stage, and the Agilent Clarity software was started to advance the sample stage into the sample chamber. The software rapidly scanned the chosen test area using a constant wave number of 1800 cm^-1, and accurately detected and pinpointed the particles within the selected area. The unoccupied area devoid of particles was automatically designated as the background. The background spectrum was gathered and readjusted, followed by the visualization of detected particles and the collection of the whole infrared spectrum. After obtaining the particle spectrum, the spectrum library was utilized to carry out qualitative analysis automatically, including the inclusion picture, size, and area of each particle. The test was supported by Shanghai WEIPU Testing Technology Group.

MPs in peripheral blood from donors were tested by Py-GC/MS. Nitric acid was added to samples for digestion at 110 °C for 12 h, and deionized water was then used to make the solution weakly acidic. After concentration, the solution was dribbled into the sampling crucible of the Py-GC/MS and tested when the solvent in the crucible was completely volatilized17. Various standards of MPs were prepared and analyzed using Py-GC/MS in order to construct the quantitative curve. A PY-3030D (Frontier) was employed for pyrolysis, with the pyrolysis temperature set at 550 °C. The chromatographic column was 30 m in length, with a 0.25 mm inner diameter and a 0.25 μm film thickness. The sample was held at 40 °C for 2 min, followed by a gradual increase in temperature at a rate of around 20 °C per minute until it reached 320 °C. The sample was maintained at this temperature for 14 min, and the entire process took a total of 30 min. The carrier gas was helium, with an ion source temperature of 230 °C. The split ratio was 5:1, and the m/z scan range was 40 to 600 (ref. 64). The experiment was supported by Shanghai WEIPU Testing Technology Group.
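The stated 30 min total can be checked against the oven program with a little arithmetic. The sketch below assumes a single linear ramp at exactly 20 °C per minute (the text says "around"); the function name is illustrative and not part of any instrument software.

```python
def oven_program_minutes(start_c, end_c, ramp_c_per_min,
                         initial_hold_min, final_hold_min):
    """Total run time (min) of a single-ramp GC oven program."""
    ramp_min = (end_c - start_c) / ramp_c_per_min  # time spent ramping
    return initial_hold_min + ramp_min + final_hold_min


# 2 min hold at 40 C, ramp to 320 C at 20 C/min, 14 min hold at 320 C
total = oven_program_minutes(40, 320, 20, 2, 14)
print(total)  # 30.0
```

The ramp from 40 °C to 320 °C accounts for 14 min, which together with the 2 min and 14 min holds gives the reported 30 min.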

Each animal experiment used at least 5–6 replicates and each in vitro experiment used at least three replicates. Specific replication details are provided in the relevant figure captions. Statistical significance was assessed with unpaired two-tailed t-tests in GraphPad Prism, and differences were considered significant when the P value was less than 0.05. Error bars in all figures indicate the standard deviation (SD).
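For readers who want to reproduce this kind of comparison outside GraphPad Prism, the following is a minimal standard-library sketch of an unpaired two-tailed Student's t-test. It assumes equal variances in the two groups (a pooled-variance test), which the text does not state, and it obtains the P value by numerically integrating the t density. The measurements are made-up illustration values, not data from the study.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Probability density of Student's t distribution."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed P value, integrating the density on [0, |t|] by Simpson's rule."""
    t = abs(t)
    if t == 0:
        return 1.0
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * area)


def unpaired_t_test(a, b):
    """Pooled-variance unpaired t-test; returns (t, degrees of freedom, P)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df, two_tailed_p(t, df)


# Hypothetical measurements for two groups of six replicates each
control = [5.1, 4.9, 5.3, 5.0, 5.2, 4.8]
treated = [6.0, 6.2, 5.9, 6.1, 6.3, 5.8]
t_stat, dof, p_value = unpaired_t_test(control, treated)
print(dof, p_value < 0.05)
```

A dedicated statistics package would normally be preferable; the hand-written integration is used here only to keep the example self-contained.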

Link:
Microplastics dampen the self-renewal of hematopoietic stem cells by disrupting the gut microbiota-hypoxanthine-Wnt ... - Nature.com

Developmental progression of DNA double-strand break repair deciphered by a single-allele resolution mutation … – Nature.com

ICP: an integrated pipeline for classifying CRISPR/Cas9 induced mutant alleles

We developed an integrated bioinformatic tool, ICP (Integrated Classifier Pipeline), to parse complex DSB repair outcomes induced by CRISPR/Cas9 and to automatically call experimental errors generated during NGS library preparation and sequencing. ICP comprises two modules: 1) a Nucleotide Position Classifier (NPClassifier), and 2) a Single Allele-resolution Classifier (SAClassifier). We employed these two complementary sequence analysis modules in tandem to enable in-depth interpretation of deep sequencing data at single allele resolution (Fig. 1a–c, see Methods section for detailed description of ICP tools). In line with the unique DNA signatures generated by distinct DSB repair pathways, we categorized the repair products into four major categories. Alleles with a deletion only on the PAM-distal side (the PAM-proximal side being protected by Cas9 protein after cleavage), a common category, were termed PEPPR class mutations (PAM-End Proximal Protected Repair, PEPPR)41,42. While single strand cleavage by the Cas9 RuvC domain can also nick the non-complementary strand at locations beyond the canonical site between the 6th and 7th nucleotide upstream of the PAM sequence, we restrict our analysis here to the majority of cases, wherein Cas9 cleavage generates blunt DSB ends, to simplify the robust classification scheme developed in this study43,44,45. Mutant alleles judged to be generated by directly annealing 2 bp microhomology sequences spanning the gRNA cleavage site were assigned to the MMEJ class (again acknowledging that such alleles can also be generated with 1 bp microhomology sequences, which, however, are not readily amenable to the semi-automated analysis we developed)46,47,48, while pure deletion alleles not belonging to either the PEPPR or MMEJ categories were classified as DELET class mutations. Remaining alleles, comprising insertions-only and indels (deletion plus insertion), were categorized as insertion class (INSRT) mutations (Fig. 1b).
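The four-way classification described above can be illustrated with a toy decision rule. This is only a sketch under stated assumptions and is not the ICP implementation: alleles are represented as a deleted reference interval plus any inserted bases, the PAM-distal side is taken to be to the left of the cut index, microhomology of at least 2 bp flanking a deletion that spans the cut is treated as MMEJ, and MMEJ is tested before PEPPR. The reference sequence, cut position, and function names are all hypothetical.

```python
def microhomology_length(ref, del_start, del_end):
    """Count bases shared between the two deletion boundaries.

    The deleted interval is ref[del_start:del_end]. Matching bases are
    counted rightwards from both boundaries and leftwards from both.
    """
    fwd = 0
    while (del_start + fwd < del_end and del_end + fwd < len(ref)
           and ref[del_start + fwd] == ref[del_end + fwd]):
        fwd += 1
    bwd = 0
    while (del_start - 1 - bwd >= 0 and del_end - 1 - bwd >= del_start
           and ref[del_start - 1 - bwd] == ref[del_end - 1 - bwd]):
        bwd += 1
    return fwd + bwd


def classify_allele(ref, cut, del_start, del_end, inserted=""):
    """Assign a toy allele to WT, INSRT, MMEJ, PEPPR or DELET."""
    has_deletion = del_end > del_start
    if inserted:
        return "INSRT"  # insertion-only or deletion plus insertion
    if not has_deletion:
        return "WT"
    spans_cut = del_start <= cut <= del_end
    if spans_cut and microhomology_length(ref, del_start, del_end) >= 2:
        return "MMEJ"
    if del_end == cut:  # deletion confined to the PAM-distal (left) side
        return "PEPPR"
    return "DELET"


# Hypothetical 29-nt reference with the cut between index 14 and 15
REF = "TTACGCTGATCCGAGTCAAGGCTGATAGG"
CUT = 15

print(classify_allele(REF, CUT, 10, 15))        # PEPPR
print(classify_allele(REF, CUT, 5, 21))         # MMEJ
print(classify_allele(REF, CUT, 12, 17))        # DELET
print(classify_allele(REF, CUT, 13, 15, "GG"))  # INSRT
```

In this toy reference the deletion from index 12 to 17 has only 1 bp of flanking microhomology, so it falls into DELET rather than MMEJ, mirroring the text's decision not to call 1 bp microhomology alleles.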

The process of DSB repair pattern profiling consists of preparing an NGS library (a), classifying the resulting parsed alleles (b) and displaying processed alleles by rank order and class of mutations (c). a NGS library preparation: Genomic DNA from F1 test flies carrying both Cas9 and gRNA expressing cassettes either maternally (dark blue bars) or paternally (red bars, or progeny from other designated crosses) is subjected to targeted PCR amplification with primers containing Illumina compatible adapters at the 5′ terminal to detect somatic indels. The gray rectangle represents a short region of genomic DNA containing a Cas9/gRNA target: the purple circle depicts Cas9 protein and the sky-blue line is the gRNA. b Classification: Raw NGS data are subjected to the NPClassifier to parse alleles into specific primary categories required for building allelic dictionaries used by the SAClassifier. Four major indel groups are categorized: PEPPR (PAM-End Proximal Protected Repair, sky-blue), MMEJ (Microhomology Mediated End-Joining, dark pink), DELET (deletion, any deletions that do not belong to PEPPR or MMEJ, orange) and INSRT (insertion, including alleles with only inserted nucleotides or with both deletions and insertions, purple). The 24-nt short PEPPR, MMEJ and DELET dictionaries are used for a more accurate classification and error calling by binning together all alleles with the same seed region that match primary allelic entries in the SAClassifier dictionaries. c DSB repair pattern visualization: intuitive rendering of the processed raw sequence data as an output of rank ordered classes of alleles. Allelic classes derived from NGS sequencing of individual flies or mosquitoes are displayed by their ranked frequency (allele landscape) and repair pattern fingerprints (color-coded by categories).

Briefly, raw reads generated from deep sequencing were subjected to a preliminary categorization using the NPClassifier, which recognizes the relative positions of editing start- and end-points flanking the Cas9 cleavage site and then generates a collection of a priori alleles for each category. These primary outputs (MMEJ and DELET) were used for building full-length standard comprehensive dictionaries listing all observed mutations, together with derived 24-nt short dictionaries (with the same seed region flanking the Cas9 cleavage site), as inputs for the SAClassifier. In addition, a synthetic PEPPR dictionary was built by iteratively increasing the length of deletions by a single nucleotide distal to the PAM site, excluding alleles belonging to the MMEJ category. By fishing the raw reads with 24-nt dictionaries, we were able to automatically recognize reads that also contained experimentally generated errors (e.g., from PCR amplification), which usually are located outside of the narrow 24-nt short dictionary window, thereby assigning such composite alleles to correctly matched root alleles (Fig. 1b). These dual, iteratively employed ICP classification tools provide a robust and precise classification of CRISPR/Cas9 induced DSB repair outcomes. Next, we developed an evocative user-friendly interface to visualize processed allelic category information in the form of rank ordered allelic landscape plots and repair pattern fingerprints (color-coded DSB repair categories), both of which are sorted by read frequency (Fig. 1c). These intuitively accessible data outputs are far more informative and discriminating than the unprocessed primary DNA sequence reads (e.g., compare the seemingly idiosyncratic raw lesions depicted in Fig. 2a to the obviously unique processed and concordant replicate patterns shown in Fig. 2b, c). The ICP was thus employed to visualize results in all the following experiments.
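The idea of fishing reads with 24-nt short dictionaries can be sketched as a substring lookup. The example below is a simplified illustration and not the SAClassifier itself: it assumes each dictionary entry is the 24-nt window centered on the repair junction of a known allele, and that a read is assigned to the first entry whose window it contains. The sequences and names are invented.

```python
def seed_window(allele_seq, junction, width=24):
    """Return the width-nt window centered on the repair junction."""
    half = width // 2
    return allele_seq[junction - half: junction + half]


def build_short_dictionary(alleles):
    """alleles maps name -> (full-length sequence, junction index)."""
    return {seed_window(seq, j): name for name, (seq, j) in alleles.items()}


def assign_read(read, short_dict):
    """Assign a read to the root allele whose seed window it contains."""
    for seed, name in short_dict.items():
        if seed in read:
            return name
    return None  # no dictionary entry matched


# Hypothetical root allele: two 20-nt flanks rejoined at index 20
LEFT = "ACGTACGTTAGCCTAGGATC"
RIGHT = "GGCTTAACCGTAGCTAGCAT"
root = LEFT + RIGHT
short_dict = build_short_dictionary({"allele_A": (root, 20)})

# Same allele with a substitution at position 0, outside the 24-nt window
read_with_error = "T" + root[1:]
print(assign_read(read_with_error, short_dict))  # allele_A
print(assign_read("A" * 40, short_dict))         # None
```

Because the substitution lies outside the seed window, the erroneous read is still binned with its root allele, which is the behavior the text attributes to the short dictionaries.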

a Examples of the top five somatic indels from individual flies derived from split-drive crosses in which the Cas9 transgene is inherited either maternally (Maternal-S, left) or paternally (Paternal-S, right), but separately from a cassette carrying the gRNA transmitted by the other parent. Purple stars indicate the color codes for mutation categories (dark pink: MMEJ, sky-blue: PEPPR, orange: DELET, purple: INSRT) and the dark green star indicates the separate raw sequence color coded for the four nucleotides A, T, G, and C. The red bar indicates Paternal-S crosses while the dark blue bar represents Maternal-S crosses. b Landscapes of top 50 alleles ranked by reads ratio. All six sequenced individual flies are plotted together, with dark blue lines plotting the data from Maternal-S crosses and the red lines from Paternal-S crosses. The y-axis presents the fraction of reads for a given allele and the x-axis depicts the top 50 alleles according to rank order by read frequency. c DSB repair fingerprints for three representative sequenced individual flies from each cross. The x-axis is the same as depicted in panel b. Both panels show the top 50 ranked alleles. d Bar plots of Class Fraction for top 50 alleles. Color codes for classes are as in panels a and c. Correlation analysis of two out of three replicates from the Maternal-S cross (e) or the Paternal-S cross (f). r² values and p-values are indicated. Source data for panels b, d, e and f are provided as a Source Data file.

Since DSB repair outcomes have been found to vary considerably as a function of Cas9 or gRNA source and level49,50, we employed the ICP platform to parse somatic indels generated by co-expressing Cas9 and gRNAs in somatic cells of fruit flies (Drosophila melanogaster) and mosquitoes (Anopheles stephensi) in various configurations associated with gene-drive systems. We first applied ICP analysis to a split gene-drive system inserted into the Drosophila pale (ple) gene that is designed to detect copying of a gene cassette in somatic cells. This element, referred to as a CopyCatcher (pleCC), carries a gRNA targeting the first intron of the Drosophila ple locus49. In this current study, we make use of low-level ectopic somatic Cas9 expression (which is substantial and broad for vasa-Cas9) to analyze DSB repair patterns across diverse cell types in F1 progeny carrying both Cas9 and gRNAs51,52,53. Because cells actively undergoing meiosis make up only a small fraction of dividing cells in an adult fly, the mutational effects of Cas9/gRNA cleavage in such F1 individuals largely reflect the somatic action of these nuclease complexes. We thus conducted several alternative crossing schemes to assess the somatic mutagenic activity of vasa-Cas9 and gRNA components when transmitted to F1 individuals in various configurations from their F0 parents: 1) Maternal Split (Maternal-S, females carrying vasa-Cas9 crossed with males carrying pleCC); 2) Paternal Split (Paternal-S, males carrying vasa-Cas9 crossed with females carrying pleCC); 3) Maternal Full (Maternal-F, females carrying both the pleCC and vasa-Cas9 transgenes); and 4) Paternal Full (Paternal-F, males carrying both the pleCC and vasa-Cas9 transgenes)49. Comparative ICP analysis revealed several striking and consistent differences between the prevalent somatic mutations generated in individual progeny in each of these different crossing schemes.
In the case of Paternal-S crosses, the resulting mutations were dominated by PEPPR alleles (4 out of top 5 alleles in Fig. 2a, Fig. S1a, and 70% of the top 50 alleles as rendered in rank ordered allelic landscapes and color coded DSB repair fingerprints in Fig. 2c). In contrast, Maternal-S crosses primarily generated MMEJ and INSRT indels (4 out of top 5 alleles were MMEJ, and at least 50% of the top 50 alleles were INSRT mutations, Fig. 2a, c, Supplementary Fig. S1a). These differences were also evident in the steeper allelic landscape curves that were generated from the Maternal-S versus Paternal-S crosses (Fig. 2b) as characterized by the initial portion of the curve depicting the 5 most frequent alleles (i.e., the dark blue lines in Fig. 2b are all above the red lines for the 5 most frequent alleles). We further quantified differences in allelic profiles between crosses by bar plots displaying the summed proportions of the different allelic classes (summing the percentages of all alleles from each category), which we termed the Class Fraction (Fig. 2d). This analysis revealed that INSRT alleles were generated at a significantly higher frequency in Maternal-S crosses, while the PEPPR class dominated among the top 50 alleles in the reciprocal Paternal-S crosses (Fig. 2d).
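The Class Fraction described above amounts to ranking alleles by read count and summing their read fractions per class. A minimal sketch follows; it assumes fractions are taken relative to all reads in the sample and that only the top-ranked alleles are summed, neither of which is spelled out in the text. The read counts are invented for illustration.

```python
from collections import defaultdict


def class_fraction(alleles, top_n=50):
    """Sum read fractions per class over the top_n alleles by read count.

    alleles is a list of (read_count, class_label) pairs.
    """
    total = sum(count for count, _ in alleles)
    ranked = sorted(alleles, key=lambda a: a[0], reverse=True)[:top_n]
    fractions = defaultdict(float)
    for count, label in ranked:
        fractions[label] += count / total
    return dict(fractions)


# Hypothetical sample with four alleles
sample = [(500, "PEPPR"), (200, "MMEJ"), (200, "PEPPR"), (100, "INSRT")]
for label, frac in class_fraction(sample).items():
    print(label, round(frac, 2))
```

The ranked list produced inside the function is the same ordering that a rank-ordered allelic landscape would plot on its x-axis.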

A striking feature of the highly divergent DSB repair signatures generated from maternally versus paternally inherited Cas9 sources was the remarkable reproducibility of their DSB repair fingerprints observed across three individual replicates from each cross (Fig. 2e, f). We performed a correlation analysis within replicates by extracting 23 common alleles across all six sequenced flies and plotted the resulting allelic profiles together relative to an arbitrarily chosen Paternal-S replicate as reference (bold red line, Supplementary Fig. S1b). We observed that the frequency distributions of these 23 common alleles were much more similar to each other within intra-cross comparisons than between inter-crosses (Supplementary Fig. S1b). This trend was also revealed by higher correlation coefficients for intra-cross comparisons than for inter-cross comparisons based on allelic read ratios (Supplementary Fig. S1c–g). Conspicuous defining differences between the Maternal-S and Paternal-S fingerprints were also evident based on the Class Fraction index (Fig. 2d). In summary, a variety of differing statistical measurements all underscore the robust, consistent similarities shared among allele profiles generated from individual replicates of the same cross, and the clearly distinctive DSB repair pattern fingerprints generated by maternal versus paternal Cas9 inheritance.
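The replicate comparison described above can be sketched as two steps: find the alleles present in every sample, then correlate their read ratios between two samples. The example assumes a plain Pearson correlation on untransformed read ratios, which the text does not specify, and uses invented allele names and values.

```python
def common_alleles(samples):
    """Return the sorted alleles present in every sample.

    Each sample is a dict mapping allele name -> read ratio.
    """
    shared = set(samples[0])
    for sample in samples[1:]:
        shared &= set(sample)
    return sorted(shared)


def pearson_r2(x, y):
    """Squared Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


# Hypothetical read ratios for two replicates of the same cross
rep1 = {"a1": 0.40, "a2": 0.20, "a3": 0.10, "x": 0.05}
rep2 = {"a1": 0.80, "a2": 0.40, "a3": 0.20}

shared = common_alleles([rep1, rep2])
r2 = pearson_r2([rep1[a] for a in shared], [rep2[a] for a in shared])
print(shared, round(r2, 3))
```

Restricting the comparison to shared alleles keeps the two vectors aligned, so alleles seen in only one replicate (such as "x" here) do not enter the correlation.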

We extended our ICP analysis of mutant allele profiles generated in the ple locus to the more extreme Maternal-F (dark blue lines) and Paternal-F (red lines) cross schemes to assess the role of inheritance patterns when both the source of vasa-Cas9 and gRNA originated from a single parent49. Again, we observed highly dominant alleles in the Maternal-F crosses, clearly evident in allelic landscapes, that deviated markedly from those produced by the Paternal-F crosses, which produced more evenly distributed spectra of alleles spread across a broad range of allelic frequencies (Fig. 3a, b). As expected based on these large differences, the repair pattern fingerprints generated from different crosses produced clearly distinguishable patterns of mutation classes, which was particularly evident when considering the Class Fraction (Fig. 3e). Cumulatively, these data suggest that the developmental timing and/or levels of Cas9 expression (maternal, early zygotic, or late zygotic) are likely to play a key role in determining which particular DSB repair pathway or sub-pathway is engaged in resolving DSBs.

a–d Unique DSB repair signatures obtained using different Cas9 sources are displayed with the top 20 alleles (landscapes and DSB repair pattern fingerprints). NGS sequencing was performed on pools of 20 adults. a vasa-Cas9 inserted in the X chromosome and the pleCC element carrying the gRNA were both carried by either female or male parents, mimicking a full-drive configuration (Maternal-F and Paternal-F crosses with vasa-Cas9). b vasa-Cas9 split crosses wherein the Cas9 transgene was transmitted either maternally (Maternal-S) or paternally (Paternal-S) and the pleCC gRNA bearing cassette was carried by the other parent. Same Maternal-S versus Paternal-S crosses as in panel b, but using either actin-Cas9 (c) or nanos-Cas9 (d) sources. e Class Fraction Index for crosses in panels a–d. Bars are shaded according to allelic class color codes. f UMAP embedding for visualizing a common set of 59 alleles shared between the four split crosses with actin-Cas9 and vasa-Cas9. Dots represent single alleles, and the colors indicate the allelic category. g Distribution of top 20 alleles generated from single flies derived from a cross between parents carrying the Spo11 gRNA and vasa-Cas9 elements (Paternal-S cross: red lines and Maternal-S cross: dark blue lines). The top plot shows the allelic landscape for the top 20 alleles from all six sequenced single flies and the bottom shows three examples of the classification fingerprints (with all allelic classes condensed into single rows) color coded for the allele categories. h Class Fraction Index for Spo11 gRNA crosses. i, j Correlation analysis between two replicates from each cross. Dark blue is Maternal-S and red is Paternal-S. r² values and p-values are indicated. Source data are provided as a Source Data file.

Previous studies have shown that the relative frequencies of NHEJ versus HDR events depend on the source of Cas9 both in terms of timing and level of expression49,50,54. We thus wondered whether ICP analysis would similarly reveal distinct DSB repair outcomes for two additional Cas9 sources (actin-Cas9 and nanos-Cas9; Cas9 expression levels: actin-Cas9 > vasa-Cas9 > nanos-Cas9) inserted at the same locus as vasa-Cas9 (Fig. 3c, d)49.

As was observed for the vasa-Cas9 source, the actin-Cas9 and nanos-Cas9 sources both generated differing allelic landscapes and repair pattern fingerprints when transmitted maternally versus paternally, which also were readily distinguishable from each other (Fig. 3b–d). Mirroring results with the vasa-Cas9 source, significant differences between the proportions of PEPPR versus MMEJ class among the top 20 alleles were observed in Maternal-S versus Paternal-S crosses for actin-Cas9. For the nanos-Cas9 source, both the MMEJ and INSRT categories were particularly reduced in Paternal-S crosses, although this latter sex-based difference was not as dramatic as for the other Cas9 sources (presumably due to its more germline restricted expression, Fig. 3d)55,56. Overall, the general trend once again indicated that maternally inherited Cas9 sources biased somatic DSB repair outcomes in favor of MMEJ and INSRT classes over PEPPR alleles, while paternal transmission of Cas9 generated mutant alleles dominated by PEPPR class alleles (Fig. 3e).

Based on the overall similarities of the DSB repair outcomes observed for actin-Cas9 and vasa-Cas9 crosses, we extracted a set of 59 shared alleles that appeared in all sequenced samples and performed UMAP (Uniform Manifold Approximation and Projection) analysis to cluster these common alleles, condensing them into 5 distinct clouds (Fig. 3f). Clouds 1, 2, 3, and 4 were dominated by alternative subsets of PEPPR alleles distinguished primarily by the length of deletion (the average deletion sizes were 24 bp, 40 bp and 31 bp for the PEPPR Mini, Midi-I and Midi-II clusters, respectively, and longer than 55 bp for the PEPPR Maxi cluster), while cloud 5 was predominantly comprised of MMEJ alleles. We reviewed raw sequences for the few trans-cloud assigned alleles and discovered that some of these alleles could be interpreted as having been generated from a second round of repair using one of the core alleles from the same cloud as a repair template. For example, we inferred that allele 58 was actually a PEPPR deletion with several nucleotides potentially having been back-filled. This result is consistent with the previous report that alleles with insertions or complex repair outcomes would be generated from several rounds of synthesis following the generation of a primary deletion event57,58. Assessing the impact of such potential complexities, which we ignore here for simplicity, will require additional future scrutiny. The remainder of these alleles, such as allele 44, could be accounted for by variability in the exact Cas9 cleavage site (between the 6th and 7th nucleotides counting from the PAM side), with an extra nucleotide being deleted on the PAM-proximal side of the gRNA cleavage site (Fig. 3f)43,59,60. Since both of these outcomes were rare, our hypothesized second-order origins for such outlier alleles further validate the robust nature of our ICP platform in recognizing core primary categories of DNA repair outcomes.
We also analyzed the common 59 alleles by plotting their read frequencies and observed that the differences between the allelic landscapes for the two reciprocal crosses for each Cas9 source mirrored the trend in Fig. 3a–d described above (Supplementary Fig. S2a, b). Cumulatively, these concordant findings support a key role for the parental origin of Cas9 serving as a major determinant of the DSB repair outcome.

Another obvious determinant of DSB repair outcome is the local genomic DNA context. We assessed the general applicability of the ICP by employing it to classify alleles generated by gRNAs targeting four other loci: prosalpha2 (prosα2), Rab11, Spo11 and Rab5 using the vasa-Cas9 source61. Paralleling our findings from the ple locus, we observed divergent allelic profiles between Paternal-S and Maternal-S crosses with distinct dominant mutation categories based on the specific target site. For example, the predominant allelic classes generated at the Spo11, prosα2 and Rab11 loci were PEPPR and INSRT alleles, while PEPPR and MMEJ alleles were most prevalent for the Rab5 targets (Fig. 3g, h, Supplementary Figs. S3–6). Among these four targets, Spo11 displayed the greatest divergence in the prevalence of top alleles generated from Maternal-S and Paternal-S crosses (reminiscent of the fine distinctions parsed for the ple locus, Fig. 3g). We nonetheless still observed high correlation coefficients between two replicates within the same cross and significantly lower correlation coefficients associated with inter-cross comparisons between maternal versus paternal Cas9 inheritance (averaged r² = 0.33, Fig. 3i, j, Supplementary Fig. S3). We also observed distinctive sex-specific DSB repair patterns for Cas9 transmission at the prosα2 and Rab11 gRNA target sites (Supplementary Figs. S4 and S5), although these differences were less pronounced than for ple and Spo11 gRNAs, while for Rab5, the allelic patterns were similar for both maternal and paternal crosses (Supplementary Fig. S6, see Supplementary Discussion Section). In summary, these data support the broad utility of the ICP pipeline to deliver unique discernible locus-specific fingerprints associated with distinct parental inheritance patterns of Cas9 that generalize to other genomic targets.

Given the strong Cas9 inheritance-dependent distinctions observed for allelic profiles resulting from maternal versus paternal Cas9/gRNA-induced DSBs in Drosophila, we wondered whether similar DSB repair pattern fingerprints could be discerned in mosquitoes carrying a linked full gene-drive in which the Cas9 and gRNA transgenes are carried together in a single cassette62,63,64,65. We examined this possibility using the transgenic An. stephensi Reckh drive, which is inserted into the kynurenine hydroxylase (kh) locus63. Because of the Cas9 and gRNA linkage, the Reckh drive behaves as the Maternal-F and Paternal-F cross configurations described above, in which all CRISPR components are carried by a single parental sex63.

Consistent with our observations in flies, the Reckh Maternal-F crosses generated a high proportion of indels that were dominated to a remarkable extent by single mutant alleles with read percentages exceeding 85% for each of the three single mosquitoes sequenced, followed by a long distributed tail of lower frequency alleles. The highly biased nature of the replicate allelic distributions is readily revealed by a virtual step-function in their rank-ordered allelic landscapes (Fig. 4a). In striking contrast, over 50% of alleles recovered from the Paternal-F crosses were wild-type (WT), which presumably reflects alleles that either remained uncut or DSB ends that were rejoined accurately without further editing. The highly predominant WT allele was followed by a very shallow tail distribution of low frequency mutant alleles in the paternal rank-ordered allelic landscapes (Fig. 4a). This dramatic difference in allelic profiles between Maternal-F versus Paternal-F crosses was also clearly displayed by the class-tally bars color coded for the different fractions of each class (black = WT) located beneath each landscape (Fig. 4a). Here, the Class Fraction Index measure indicated that Maternal-F crosses generated a greater proportion of INSRT alleles in the first two samples, while Paternal-F crosses produced a high frequency of PEPPR alleles (Fig. 4b). As in the case of allelic profiles recovered at the ple and Spo11 loci in flies, common sets of highly correlated mutant DSB repair fingerprints were observed across all three replicates of the Paternal-F Reckh crosses (Supplementary Fig. S7). A similar comparison of allelic distributions in the maternal crosses was precluded by virtue of the single highly dominant alleles and corresponding paucity of lower frequency events, the nature of which varied greatly between replicates. We conclude that the high-resolution performance of the ICP platform in Drosophila can be generalized to other insects such as An. stephensi to robustly discern sex-dependent CRISPR transmission patterns resulting in distinct DSB repair outcomes.

a Rank-ordered landscapes of the top 50 alleles generated from NGS analysis of single mosquitoes. Colored bars with red dots indicate mutated alleles, and black bars with black dots indicate an unmutated WT allele. Middle panels: allelic class fingerprints color coded as in previous figures. Bottom bars: fraction of each allelic class, including WT (black), PEPPR (sky-blue), MMEJ (deep pink), DELET (orange) and INSRT (purple). Numbers indicate the percentage of the corresponding class. b Class Fraction Index for single mosquito sequencing data in panel a. c Developmental time-points for sample collections. d Kinetics of Cas9 mutagenesis generated by the Reckh gRNA. Lines represent the summed fraction of mutant alleles at each time-point. Dark-blue lines indicate maternal (Maternal-F) crosses and red lines paternal (Paternal-F) crosses. e DSB repair fingerprints at different timepoints. Samples were collected at the time points shown in panel c and 20 eggs, larvae, pupae or adults were pooled together for genomic DNA extraction and deep sequencing. The far left and far right panels indicate the Class percentages including WT alleles (black), displaying the proportion of each class at single time-points. Source data are provided as a Source Data file.

Given the dramatic differences we observed in the frequency and nature of somatic alleles generated in maternal versus paternal-sourced Cas9 in both flies and mosquitoes, we wondered whether the developmental timing of Cas9/gRNA expression (maternal = early? and paternal = late?) was the key determinant for these highly reproducible DSB repair fingerprints. We tested this hypothesis by assessing whether DSB repair fingerprints varied as a function of developmental progression using a series of narrowly timed sample collections of F1 mosquitoes produced from crosses of Reckh parents to WT and assayed DSB repair spectra using the ICP pipeline at 12 different developmental stages (Fig. 4c. Note: as homozygous Reckh transgenic mosquitoes were crossed to WT, all F1 progeny carried one Reckh allele and one WT receiver allele, the latter of which was amplified for DSB repair analysis). We tracked a diminishing proportion of WT (presumably uncut) alleles and a corresponding increase in mutant alleles of various classes at each of the time points (Fig. 4d). Strikingly, nearly half of the target alleles were edited in embryos by 30 minutes post-oviposition for both the Maternal-F and Paternal-F Reckh crosses, which corresponds to early pre-blastoderm stages prior to the maternal-to-zygotic transition, suggesting a very early activity of Cas9 in mosquito embryos driven either by maternally inherited Cas9/gRNA complexes or potentially by very early zygotic expression of the Cas9 and gRNA components (Fig. 4d)66. We also observed similarly frequent indels being generated as early as 30 min in flies expressing Cas9 (either maternally or paternally) with a gRNA targeting the prosα2 locus, although the dynamics of Cas9 production are distinct in these two organisms (Supplementary Fig. S8a). Following this initial surge in target cleavage, we observed divergent trajectories in the accumulation of mutant alleles between maternal versus paternal lineages.
As an overall trend, mutant alleles accumulated progressively in the Maternal-F lineage until virtually no WT alleles remained, while in the Paternal-F lineage, even at the endpoint of adulthood, approximately 60% of WT alleles persisted, in line with our single time point experiments (Fig. 4a, d, Supplementary Fig. S8b). As observed in the final distributions of adult alleles, progeny from Maternal-F crosses tended to be enriched for INSRT alleles over the entire developmental time course, while PEPPR alleles were more common in Paternal-F crosses with pronounced accumulation of such alleles during later stages (Fig. 4e). A finer scale analysis of the categories of mutant alleles generated over time revealed dynamic patterns of prevalent alleles during mosquito developmental stages (Fig. 4e). For example, the proportion of MMEJ alleles peaked at the 2-hour and 4-hour time points (Fig. 4e). Similarly, a split-drive expressing a gRNA targeting the Drosophila prosα2 locus generated distinct temporal profiles of cleavage patterns in crosses from female versus male parents carrying the drive element (Supplementary Fig. S9).
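The kinetics described above track the summed fraction of mutant alleles at each time point, which is simply one minus the WT fraction. A small sketch follows, with hypothetical read counts per class and time-point labels chosen only for illustration.

```python
def mutant_fraction(class_reads):
    """Fraction of reads that are not wild-type at one time point.

    class_reads maps class label (e.g. "WT", "PEPPR") -> read count.
    """
    total = sum(class_reads.values())
    return 1 - class_reads.get("WT", 0) / total


# Hypothetical time course for one cross (read counts per class)
time_course = {
    "0.5 h": {"WT": 55, "INSRT": 25, "PEPPR": 15, "MMEJ": 5},
    "4 h": {"WT": 40, "INSRT": 30, "PEPPR": 15, "MMEJ": 15},
    "adult": {"WT": 5, "INSRT": 60, "PEPPR": 25, "MMEJ": 10},
}

for label, reads in time_course.items():
    print(label, round(mutant_fraction(reads), 2))
```

Plotting these values against developmental time for each cross would give curves of the kind described for the maternal and paternal lineages.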

One unexpected feature of the developmental variations in allelic composition we observed was that the proportion of WT alleles increased at certain time points (e.g., at 1 hour in the maternal cross and from 12 hours to day 1 (24 h) in the paternal cross). These temporal fluctuations were also observed in flies expressing Cas9 and a prosα2 gRNA at two hours after oviposition (Supplementary Figs. S8a and S9), suggesting that this phenomenon might reflect a generally relevant form of clonal selection for WT cells during pre-blastoderm stages. Such clonal selection might arise if mutant cells experienced negative selection at certain developmental stages. In the case of paternal transmission, one strong line of evidence supporting this WT clonal selection hypothesis is that in adults, the Reckh element is transmitted to over 99% of F1 progeny, indicating that nearly all target alleles in the germline must be WT. This high frequency of paternal germline transmission is also consistent with the high prevalence of WT alleles tallied at 12 h in embryos derived from the paternal crosses (Fig. 4e, see Supplementary Discussion Section for more in-depth consideration of this point). We analyzed the developmental distributions of 21 common alleles that were generated at all time-points (Supplementary Fig. S10a–e). Most of these common alleles belonged to the PEPPR class, while only five were INSRT alleles, despite the INSRT class overall being the most prevalent for both crosses, again suggesting that INSRT alleles have a higher diversity than other mutation categories (Supplementary Fig. S10a). Overall, this analysis is in line with our previous observation that Maternal-F crosses produced more INSRT alleles while Paternal-F crosses generated a preponderance of PEPPR alleles (Supplementary Fig. S10b).

Given the strong influence of maternal versus paternal origin of Cas9 on the resulting distributions of alleles characterized above by ICP analysis, we wondered whether such allelic signatures could be exploited for lineage tracing in randomly mating multi-generational population cages. We first examined ICP outputs from a controlled crossing scheme carried out over three generations with pleCC and Reckh gRNAs to derive allelic fingerprints distinguishing parents of origin, both by identifying somatic alleles in the F1 generation and by assessing which of those alleles might be transmitted through the germline to non-fluorescent progeny (i.e., those not inheriting the pleCC or Reckh element) at the F2 generation (Fig. 5a–d, Supplementary Fig. S11). As anticipated, in both pleCC and Reckh Maternal-F crosses, single dominant somatic alleles were observed in the F1 generation, with the top single allele representing more than 50% of all alleles (Fig. 5a, c). Furthermore, all such predominant somatic mutant alleles, which precluded gene-cassette copying of the pleCC or Reckh drive elements in those F1 individuals, were transmitted faithfully through the germline to non-fluorescent F2 progeny with approximately 50% frequency. We also observed marked differences in the other half of total reads in F2 progeny depending on the origin of Cas9/gRNA complexes. Thus, a distribution of multiple diverse low frequency mutations was generated when crossing F1 pleCC+ or Reckh+ females with WT males (presumably derived from F1 drive females having deposited Cas9/gRNA complexes maternally that then acted on the paternally sourced WT allele somatically in F2 individuals). In the reciprocal male cross, however, approximately 50% of all alleles remained WT (Fig. 5b, d, Supplementary Fig. S12a–f).
These findings support the hypothesis that the top somatic indels derived from maternal Cas9 sources were generated at very early developmental stages (possibly at the point of fertilization or shortly thereafter during the first somatic cell division), resulting in a single mutant allele being initially produced and then transmitted to every descendant cell, including all germline progenitor cells49. With the paternally sourced Cas9 and gRNA, arrays of variable somatic mutations were recovered, with the most prominent alleles accounting for fewer than 10% of the total alleles in F1 progeny (Fig. 5b). Accordingly, paternally generated F1 somatic alleles were more randomly transmitted via the germline of individuals that failed to copy the gene cassette for either the pleCC or Reckh elements. As a result of this diversity of somatic F1 alleles, only occasionally were the most prevalent alleles also transmitted through the germline (e.g., individuals 1, 4 and 5 in Fig. 5b, Supplementary Fig. S12g–l).

Fig. 5. Primary DNA sequences of top single alleles and their percentages of the total alleles from six individual sequenced flies derived from ple gRNA Maternal-F (a) and Paternal-F (b) crosses. Gray bars indicate the location of the gRNA protospacer and red arrowheads are the associated PAM sites. The first row depicts the reference sequence covering the expected DSB cleavage site. Colored squares in the right column indicate the class to which a given allele belongs. The tables shown on the right of each allele show its frequency among all reads. Left columns of the table indicate frequencies of the somatic allele, and the right columns are the top germline mutant allele frequency obtained by sequencing F2 non-fluorescent progeny derived from the same F1 individuals whose top somatic allele is displayed in the left column (excluding WT alleles). Colored dots indicate different alleles, with the same color shared between two columns indicating that the same allele appeared as both the top somatic and the top germline indel from the same F0 founders. c, d Allele profiles generated by Reckh parents and progeny generated with the same crossing scheme as for the pleCC. c Tabulation of the Maternal-F cross. d Tabulation of the Paternal-F cross. e Crossing scheme for the Reckh cage trials. Three individual cages were seeded with 10 homozygous Reckh females, 90 WT females and 100 WT males for the maternally initiated lineage, while the paternally initiated cages were seeded with 10 homozygous Reckh males, 90 WT males and 100 WT females. At each of the following three generations, 10 Reckh+ females and 10 Reckh+ males were randomly collected for single-mosquito deep sequencing. f Biased inheritance of Reckh was observed in the maternally seeded cages at generations 2 and 3, but not in the paternally seeded cages. Pink bars denote the fraction of sequenced individual mosquitoes inheriting Reckh from female parents, and cyan bars represent Reckh inheritance from male parents.
Source data are provided as a Source Data file.

The Reckh element in mosquitoes performed similarly to the fly pleCC; however, Reckh F1 individuals displayed less frequent zygotic cleavage and a corresponding reduction in the diversity of resulting somatically generated mutations (>50% WT alleles remained, Paternal-F cross). Consistent with this limited number and array of somatic mutations in the F1 generation from the Paternal-F cross, NHEJ mutations were only rarely transmitted to the F2 generation, probably due to more germline-restricted expression of vasa-Cas9 in mosquitoes as compared to flies (Fig. 5c, d). These results again suggest that cleavage and repair events were generated later during development in paternal crosses, resulting in a stochastic transmission of F1 somatic alleles to the germline, which were largely uncorrelated with the most prevalent allele present somatically in the F1 parent49. Taken together, these highly divergent sex-dependent DSB repair signatures suggested that such genetic fingerprints could be used to track parental history in the context of randomly mating multi-generation population cages.

Based on the highly dominant mutant indels (Maternal-F) versus WT (Paternal-F) alleles generated by the Reckh genetic element described above, we evaluated inheritance patterns of indels in multi-generational cages initiated by a 5% introduction of Reckh into WT populations through either maternal or paternal lineages in the F0 generation (Fig. 5e). We randomly selected at least 20 fluorescence marker-positive mosquitoes (10 females and 10 males) for NGS analysis at generations 2 and 3, when the Reckh allele was still present at relatively low frequencies in the population and random mating was more likely to have taken place between Reckh/+ heterozygous and WT mosquitoes. Thus, we envisioned that the source of the Reckh allele could be tracked back to a male versus female parent of origin by examining whether a dominant WT allele was present (inherited paternally) or not (inherited maternally) (Fig. 5e, f). Following this reasoning, we inferred a strong bias in the maternally seeded lineage toward progeny inheriting the Reckh element from Reckh+ males mating with WT females during generations 2 and 3, rather than the reverse (i.e., female transmission of Reckh alleles). Indeed, in one maternally seeded replicate (cage 2, generation 3), 100% of the progeny had inherited the Reckh element from their fathers (Fig. 5f). In contrast to the striking sex-specific transmission bias observed in maternally seeded cages, progeny from paternally seeded cages displayed more evenly distributed stochastic parental inheritance patterns (Fig. 5f). These highly reproducible parent-of-origin signatures demonstrate the utility of ICP in allelic lineage tracking, which could be of great potential utility in evaluating alternative initial release strategies for gene-drive mosquitoes as well as post-release surveillance of gene-drives as they spread through wild target populations (see Discussion).
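The parent-of-origin reasoning above (a dominant WT allele implies paternal inheritance of Reckh; its absence implies maternal inheritance) can be written as a simple decision rule. The sketch below is illustrative only: the function names, the input layout, and in particular the `wt_threshold` cut-off of 0.4 are assumptions made for the example, not values reported in the text, which states only that roughly half of the alleles remain WT in paternal crosses while a single mutant allele exceeds 50% in maternal crosses.

```python
def infer_parent_of_origin(allele_fractions, wt_threshold=0.4):
    """Classify one Reckh+ individual by its allele profile.

    allele_fractions: dict mapping allele label -> fraction of reads
    (hypothetical layout). wt_threshold is an assumed cut-off for
    calling WT a "dominant" allele.
    """
    wt_fraction = allele_fractions.get("WT", 0.0)
    return "paternal" if wt_fraction >= wt_threshold else "maternal"


def paternal_share(individuals, wt_threshold=0.4):
    """Fraction of sampled individuals inferred to have inherited
    the element from their father."""
    if not individuals:
        raise ValueError("no individuals supplied")
    calls = [infer_parent_of_origin(ind, wt_threshold) for ind in individuals]
    return calls.count("paternal") / len(calls)


# Invented allele profiles, for illustration only.
sampled = [
    {"WT": 0.52, "del_3bp": 0.06, "ins_1bp": 0.04},  # dominant WT allele
    {"del_7bp": 0.61, "WT": 0.02},                   # dominant mutant allele
    {"WT": 0.49, "del_2bp": 0.08},                   # dominant WT allele
    {"WT": 0.55},                                    # dominant WT allele
]
print(paternal_share(sampled))  # 0.75
```

In practice the cut-off would need to be calibrated against controlled crosses such as those in Fig. 5c, d before being applied to cage samples.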

Another important challenge for deciphering DSB repair outcomes is to track both NHEJ and gene-cassette mediated HDR events within the same sample. Such a comprehensive genetic detection tool could have broad and impactful applications (see Discussion). For example, one important and non-trivial application is to follow the progress of gene-drives in a marker-free fashion as they spread through insect populations. Such dual tracking capability would address the potential concern that mutations eliminating a dominant marker for the gene-drive element could evade phenotype-based assessments of the drive process. Accordingly, we devised a three-step short-amplicon (200–400 bp) deep sequencing strategy based on tightly linked colony-specific nucleotide polymorphisms distinguishing donor versus receiver chromosomes to detect copying of two CopyCatcher elements, pleCC and hthCC, from their chromosomes of origin (donor chromosome) to WT homologous (receiver chromosome) targets (Fig. 6a)49. Notably, this strategy only amplified the inserted gene cassette on the donor chromosome and/or the cassette if it had copied onto the receiver chromosome. Thus, the measured allelic frequencies indicate the relative proportions of gene cassettes copied to the receiver chromosome versus those residing on the donor chromosome (Fig. 6b displays the inferred somatic HDR frequency quantified from the three-step NGS protocol as well as indels quantified by our standard two-step NGS protocol; see Methods for additional details).
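The quantity measured by this cassette-specific strategy can be illustrated with a minimal Python sketch. It assumes each cassette-bearing read has already been assigned to the donor or receiver chromosome by its linked polymorphisms; the function names and read counts are hypothetical. The second function is an additional assumption that is not taken from the text: it converts the read fraction into a fraction of cells with a copied cassette under a simple model in which every cell carries one donor cassette, a copied cell carries exactly one extra cassette on the receiver chromosome, and both amplify equally. The text does not state which convention its reported percentages follow.

```python
def receiver_read_fraction(donor_reads, receiver_reads):
    """Fraction of cassette-bearing reads linked to receiver-chromosome
    polymorphisms (i.e., cassettes that were copied)."""
    total = donor_reads + receiver_reads
    if total == 0:
        raise ValueError("no cassette reads supplied")
    return receiver_reads / total


def copied_cell_fraction(read_fraction):
    """Convert a receiver read fraction f into a fraction of cells p
    with a copied cassette, under the assumed model f = p / (1 + p).

    Under that model p = f / (1 - f), and f cannot exceed 0.5.
    """
    if not 0.0 <= read_fraction <= 0.5:
        raise ValueError("read fraction outside the range allowed by the model")
    return read_fraction / (1.0 - read_fraction)


# Invented read counts, for illustration only.
f = receiver_read_fraction(donor_reads=800, receiver_reads=200)
print(f)  # 0.2
print(round(copied_cell_fraction(f), 6))  # 0.25
```

Keeping the two quantities separate makes it explicit whether a reported HDR percentage refers to reads or to cells.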

Fig. 6. a Scheme for tracking gene-drive copying using NGS. Gray bars: genomic DNA, pink oval: Cas9 protein, sky-blue line: gRNA, colored asterisks: polymorphisms. Color-coded rectangles represent four nucleotides. The four possible recombinants listed are generated by resolving Holliday junctions at different sites marked with black crosses. b NGS-based quantification of somatic HDR generated by pleCC in F1 progeny. Areas delineated by dotted lines indicate patches of cells in which somatic HDR copying events have taken place, viewed under either bright field (upper) or the RFP fluorescence field (middle). Bottom bars summarize the inferred frequency of somatic HDR (orange), indels (green) and WT alleles (black) derived from the deep sequencing data using the same samples photographed above. More than three flies from each cross were imaged and used for analysis. Scale bars indicate 200 pixels. c Somatic HDR profile with ple gRNA. The red line is for the Maternal-F cross and the dark blue line for the Paternal-F cross. d Diagram of the hthCC. Black double arrow: recoded hth cDNA, blue rectangles: exon 1, light green rectangles: exons 2-14, and colored lines underneath represent probes used for detection. e In situ images of embryos laid by hthCC-vasa-Cas9 females crossed with WT males. Blue = exon 1, green = WT exons 2-14, red = recoded cDNA for exons 2-14. Insets are magnified single nuclei indicated by colored arrows. This experiment has been repeated at least three times. Scale bars represent 10 µm. f Temporal profiles for somatic HDR-mediated copying of the hthCC element assessed by NGS as described for the pleCC in panels c and f. The y-axis tabulates the percentage of HDR at a given time point. The table at the bottom quantifies the HDR fraction at given time points for both the Paternal-F and Maternal-F crosses. Source data are provided as a Source Data file.

In our first set of experiments, we analyzed editing outcomes by examining F1 progeny derived from Maternal-S and Paternal-S pleCC crosses. We compared the rates of somatic HDR measured by NGS analysis to those evaluated by image-based phenotypes associated with copying of the CopyCatcher element. As summarized previously, CopyCatchers such as the pleCC are designed to permit quantification of concordant homozygous mutant clonal phenotypes (e.g., pale patches of thoracic cuticle and embedded sectors of colorless bristles), with underlying DsRed+ fluorescent cell phenotypes49. Individual flies on which imaging-based analysis had been conducted were then subjected to separate NGS HDR-fingerprinting and INDELs-fingerprinting, resulting in a comprehensive quantification of HDR, NHEJ, and WT alleles within the same sample (Fig. 6b; libraries for HDR-fingerprinting and INDELs-fingerprinting were prepared from the same individual fly, but with different DNA preparation and sequencing protocols as detailed in Methods). For these experiments, F1 flies were genotyped and those carrying both Cas9 and pleCC gRNA were used for NGS analysis (data shown here are the inferred frequencies of somatic HDR, NHEJ events, and WT alleles). This dual integrated analysis revealed that Maternal-S crosses resulted in ~15% somatic HDR-mediated cassette copying events on average based on sequencing, and that such cassette copying was yet more frequent in Paternal-S crosses, producing ~25% somatic HDR. The nearly two-fold greater HDR-mediated copying efficiency detected by sequencing in Paternal-S crosses mirrors phenotypic outcomes wherein maternally inherited Cas9 similarly results in a lower frequency of cassette copying detected by fluorescence image analysis in somatic cells than for paternally inherited Cas9 (Fig. 6b)49.

Our genetic analysis of stage-dependent differences in DSB repair pathway activity in this study is consistent with a commonly held view in the gene-drive field, based on a variety of indirect genetic transmission data, that HDR-mediated cassette copying does not occur efficiently during early embryonic stages50,51,63,67,68,69,70. This inference, however, has not yet been verified experimentally. We thus sought to provide direct evidence supporting this key supposition by using our validated NGS-based HDR-fingerprinting protocol to track somatic HDR events across a range of developmental stages in both Maternal-F and Paternal-F crosses, in which the Cas9 and gRNA transgenes are transmitted together either maternally or paternally. Notably, we collected samples at 9 time points and pooled 20 F1 progeny for sequencing to establish the developmental profile of somatic HDR with pleCC (samples were thus collected without genotyping, since it is impractical to genotype individual embryos and young larvae). Because of the limitations imposed by embryo pooling, we were unable to use the same samples to also quantify the generation of somatic NHEJ alleles (i.e., only half of the F1 progeny carried the vasa-Cas9 transgene on the X chromosome, and those embryos lacking this transgene were not suitable for generating mutations; note that such an analysis was possible in the case of the viable Reckh drive shown in Fig. 4e as well as for a viable split-drive allele inserted into the essential prosalpha2 locus shown in Supplementary Fig. S9). Indeed, NGS analysis detected only very rare examples of somatic HDR events in early embryos derived from both crosses (Fig. 6c).
Notably, HDR in the Paternal-F cross detected by this sequencing protocol increased substantially to 35.9% during adult stages, a period coinciding with the temporal peak of the pale expression profile (note that in this experiment we employed the actin-Cas9 rather than the vasa-Cas9 source, which has a higher level of Cas9 expression in somatic cells and generates a correspondingly higher frequency of somatic HDR)49.

We extended our sequencing-based strategy to quantify somatic HDR using a second CopyCatcher element (hthCC) designed specifically to identify even rare copying events in early blastoderm-stage embryos. The hthCC is inserted into the homothorax (hth) gene and was engineered to visualize HDR-mediated copying of the gene cassette by fluorescence in situ hybridization (FISH) using discriminating fluorescent RNA probes complementary to specific endogenous versus recoded cDNA sequences (Fig. 6d, e). In this system, copying of the transgene from the donor chromosome to the receiver chromosome would be indicated by the presence of two nuclear dots of red fluorescence detected by the hth recoded cDNA-specific probe (indicating two copies of recoded hth cDNA). In contrast, cells in which no copying occurred should contain only a single nuclear red dot signal (from the donor allele). Such in situ analysis detected no clear case of gene cassette copying in any of the ~5000 blastoderm-stage cells examined across ~500 embryos (with the caveat that some mitotic nuclei generate ambiguous signals depending on their orientation). This qualified negative result assessed by in situ analysis was consistent with the very low estimates of HDR frequency during the same early blastoderm-stage developmental window based on NGS analysis in staged time-course experiments, although the latter sequencing method did detect very low levels of somatic HDR at ~3 hours after egg laying from the Paternal-F crosses (and no copying until day three of larval development in the maternal cross; Fig. 6d–f). The very low levels of somatic HDR observed in early embryos for the hthCC construct either by in situ hybridization or by NGS parallel the results summarized above for the pleCC element (Fig. 6c, f).
The maximal somatic HDR frequency observed for the hthCC Maternal-F crosses (0.06% at day 3 after egg laying) was somewhat lower than that for the similar cross for pleCC (0.35% at adult stage), consistent with the predominance of single mutant alleles being generated at very early stages following fertilization in Maternal-F crosses. In contrast to the exceedingly rare copying of the hthCC element detected in early embryos for either the Maternal-F or Paternal-F crosses, the same element frequently copied to the homologous chromosome during later developmental stages in Paternal-F crosses as assessed by NGS. The hthCC element again copied with somewhat lower efficiency than the pleCC element (e.g., 15.2% for hthCC versus 35.9% for pleCC tabulated in adults), presumably reflecting differing genomic cleavage rates or gene conversion efficiencies generated by their respective gRNAs (including total cleavage levels and temporal features). In aggregate, these two examples of quantitative analysis of copying frequencies based on both NGS and in situ analysis demonstrate that ICP and NGS-based quantification of gene conversion events can be successfully integrated for a comprehensive analysis of DSB repair outcomes, including both NHEJ and HDR events as a function of developmental stage. These powerful tools could also be applied to following gene-drive spread through freely mating populations in a marker-free manner as well as to a variety of other applications including gene therapy (see Discussion).
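To put the negative in situ result in quantitative terms, one can ask what copying frequency would still be compatible with observing zero events among the ~5000 cells examined. The sketch below is not part of the original analysis. It assumes that cells are independent observations (which is only approximate, since cells come from ~500 embryos and some mitotic nuclei were ambiguous) and uses the standard exact binomial bound for zero observed events; the function name is hypothetical.

```python
def zero_event_upper_bound(n_observed, confidence=0.95):
    """One-sided upper bound on the per-cell event probability p when
    0 events are seen in n_observed independent cells.

    Solves (1 - p) ** n_observed = 1 - confidence for p.
    """
    if n_observed <= 0:
        raise ValueError("n_observed must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return 1.0 - (1.0 - confidence) ** (1.0 / n_observed)


bound = zero_event_upper_bound(5000)
print(f"{bound:.4%}")  # 0.0599%
```

Under these assumptions, zero events in 5000 cells rules out copying frequencies above roughly 0.06% at 95% confidence (close to the "rule of three" approximation 3/n), but not rarer events. That is in keeping with the sequencing data detecting only very low levels of somatic HDR in early embryos.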

View post:
Developmental progression of DNA double-strand break repair deciphered by a single-allele resolution mutation ... - Nature.com
