Comparison metrics for RNA-seq (splice variant) processing methods

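  • A paper that compares multiple methods for evaluating splicing variation across a variety of metrics: here
  • The details of those metrics are listed on page 45, at the very end of the Supplementary Note inside the linked paper's supplementary material.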
    • Alignment yield
      • "We measured the proportion of sequenced (or simulated) reads that were mapped and the frequency of ambiguous mappings"
    • Mismatch and truncation frequencies
      • "the number of mismatches (substitutions) per primary read alignment and visualized the resulting distributions"
    • Truncation frequency
      • "Some aligners can truncate the ends of reads, and thus output a partial alignment when unable to map an entire sequence. ... A good spliced aligner would therefore be expected to output a moderate proportion of truncated alignments."
    • Basewise accuracy
      • "the most fundamental. Here, we measured the proportion of all simulated bases that were correctly mapped, and the proportion incorrectly mapped" The results are also tallied conditionally, separately for positions affected by splicing and positions that are not.
    • Read placement accuracy
      • "In addition to basewise accuracy, it is important to measure performance at the read level"
    • Accuracy among unique and ambiguous mappings
      • "By comparing accuracy between unique and ambiguous mappings, a level of confidence can be established for each category"
    • Indel frequency and accuracy
      • "the capability to detect indels therefore differs markedly among mappers ... the number of insertions and deletions in the primary alignments from each protocol"
    • Spatial distribution of mismatches, indels and splices over read sequences
      • "biases may result in the distribution of alignment features"
    • Coverage of annotated genes
      • "number of exon hits (alignments covering only exonic features), spliced exon hits (as the previous category, but aligning with a splice operation), partial exon hits (alignments covering exonic and non-exonic features), intron hits, intergenic hits, number of genes with proper exon hits, proportion of exon hits and the number of alignments associated with specific types of features"
    • Splice frequency and junction characteristics
      • "the number of reported splices divided by the number of sequenced reads"
    • Splice accuracy
      • "A splice was considered correct if placed so that its genomic start and end (donor and acceptor) coordinates agreed with those of the true alignment"
    • Junction frequency and accuracy
      • "A junction was considered correct if its genomic start and end coordinates matched those of a junction in the simulated transcriptome"
    • Transcript reconstruction accuracy
      • "we ran Cufflinks on the output from each alignment protocol, and computed precision and recall for reconstruction of individual exons as well as spliced transcripts"
  • To actually use this paper's results, you have to build a function that projects the multi-metric scores onto a single dimension for your own decision.
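
One possible shape for such a projection function is sketched below. It is only an illustration, not anything the paper prescribes: the function name `project_scores`, the metric names, the aligner names, the weights, and all numbers are made up for the example. It assumes each metric is min-max normalized across aligners, flipped when lower is better, and then combined as a weighted average. Other choices (rank aggregation, hard thresholds on a must-have metric) would be just as defensible.

```python
from typing import Dict, Mapping


def project_scores(
    scores: Mapping[str, Mapping[str, float]],
    weights: Mapping[str, float],
    higher_is_better: Mapping[str, bool],
) -> Dict[str, float]:
    """Collapse per-metric scores into one number per aligner.

    scores:           aligner name -> {metric name -> raw value}
    weights:          metric name -> importance for *your* decision
    higher_is_better: metric name -> False if a smaller raw value is better
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    projected = {name: 0.0 for name in scores}
    for metric, weight in weights.items():
        values = [scores[name][metric] for name in scores]
        lo, hi = min(values), max(values)
        span = hi - lo
        for name in scores:
            # If every aligner ties on this metric it cannot discriminate,
            # so give everyone the neutral value 0.5.
            norm = 0.5 if span == 0 else (scores[name][metric] - lo) / span
            if not higher_is_better[metric]:
                norm = 1.0 - norm
            projected[name] += weight * norm / total_weight
    return projected


# Hypothetical inputs, NOT values taken from the paper.
example_scores = {
    "aligner_a": {"basewise_accuracy": 0.95, "splice_recall": 0.80, "mismatch_rate": 0.02},
    "aligner_b": {"basewise_accuracy": 0.90, "splice_recall": 0.90, "mismatch_rate": 0.01},
}
example_weights = {"basewise_accuracy": 1.0, "splice_recall": 2.0, "mismatch_rate": 1.0}
example_direction = {"basewise_accuracy": True, "splice_recall": True, "mismatch_rate": False}

ranking = project_scores(example_scores, example_weights, example_direction)
best = max(ranking, key=ranking.get)
print(ranking, best)
```

The weights are where "your own decision" enters: someone who mainly cares about novel junction discovery would weight the splice and junction metrics heavily, while someone calling variants would weight mismatch and indel accuracy instead.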