
Quanto RDF

Raw sequence data quality statistics calculated from the Sequence Read Archive data

Dataset specifications

Tags
Other
Provenance Original
Registration Submitted
Data provider
  • Database Center for Life Science
Creator
  • Tazro Inutano Ohta, Database Center for Life Science
Issued 2016-07-12
Licenses
  • Attribution 4.0 International (CC BY 4.0)
Version 0.1.2
Download https://rdfportal.org/download/quanto
SPARQL Endpoint https://rdfportal.org/primary/sparql
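
The endpoint listed above can also be queried from a script. The following is a minimal sketch, assuming the endpoint follows the standard SPARQL 1.1 Protocol (a GET request with a `query` parameter) and can return the SPARQL JSON results format. The names `build_request` and `run_query` are hypothetical helpers, not part of any Quanto tooling.

```python
import urllib.parse
import urllib.request

# Endpoint URL as listed in the dataset specifications.
ENDPOINT = "https://rdfportal.org/primary/sparql"


def build_request(query: str, endpoint: str = ENDPOINT) -> urllib.request.Request:
    """Build a GET request carrying the query (assumes ?query=... is accepted)."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    # Ask for the standard SPARQL JSON results format.
    headers = {"Accept": "application/sparql-results+json"}
    return urllib.request.Request(url, headers=headers)


def run_query(query: str) -> bytes:
    """Send the query and return the raw response body (needs network access)."""
    with urllib.request.urlopen(build_request(query), timeout=60) as response:
        return response.read()
```

Building the request separately from sending it makes it possible to inspect the URL and headers without touching the network.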

Dataset statistics

Triples 107782639
Subjects 21955729
Properties 29
Objects 31484031
Classes 8

SPARQL example queries

Example 1

# Retrieve statistics of SRA entry ERR026579 from the Quanto database

PREFIX sos: <http://purl.jp/bio/01/quanto/ontology/sos#>
PREFIX quanto: <http://purl.jp/bio/01/quanto/resource/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX pav: <http://purl.org/pav/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT
  ?quanto
  ?quanto_id
  ?encoding
  ?file_type
  ?version
  ?fastqc_version
  ?min_seq_len
  ?median_seq_len
  ?max_seq_len
  ?mean_bc_quality
  ?median_bc_quality
  ?n_content
  ?gc_content
  ?total_seq
FROM <http://quanto.dbcls.jp>
WHERE {
  ?quanto a sos:SequenceStatisticsReport .
  ?quanto dct:identifier ?quanto_id .
  ?quanto rdfs:seeAlso <http://identifiers.org/insdc.sra/ERR026579> .
  ?quanto sos:fastqcVersion ?fastqc_version .
  ?quanto sos:encoding ?encoding .
  ?quanto sos:fileType ?file_type .
  ?quanto pav:version ?version .
  ?quanto sos:maxSequenceLength ?bk2 .
  ?bk2 rdf:value ?max_seq_len .
  ?quanto sos:medianSequenceLength ?bk3 .
  ?bk3 rdf:value ?median_seq_len .
  ?quanto sos:minSequenceLength ?bk4 .
  ?bk4 rdf:value ?min_seq_len .
  ?quanto sos:overallMeanBaseCallQuality ?bk5 .
  ?bk5 rdf:value ?mean_bc_quality .
  ?quanto sos:overallMedianBaseCallQuality ?bk6 .
  ?bk6 rdf:value ?median_bc_quality .
  ?quanto sos:overallNContent ?bk7 .
  ?bk7 rdf:value ?n_content .
  ?quanto sos:percentGC ?bk8 .
  ?bk8 rdf:value ?gc_content .
  ?quanto sos:totalSequences ?bk9 .
  ?bk9 rdf:value ?total_seq .
}
LIMIT 1
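
Example 1 hard-codes the accession ERR026579. The sketch below shows one way to generate the same kind of query for another run. It is a reduced version that selects only two of the statistics, and it makes two assumptions: run accessions have the form SRR, ERR, or DRR followed by digits, and the helper names (`RUN_ACCESSION`, `QUERY_TEMPLATE`, `stats_query`) are hypothetical. Validating the accession before substitution keeps arbitrary text out of the query.

```python
import re

# Assumption: run accessions look like ERR026579 (SRR/ERR/DRR plus digits).
RUN_ACCESSION = re.compile(r"[SED]RR\d+")

# Reduced form of Example 1; doubled braces are literal braces for str.format.
QUERY_TEMPLATE = """\
PREFIX sos: <http://purl.jp/bio/01/quanto/ontology/sos#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT ?quanto ?total_seq ?gc_content
FROM <http://quanto.dbcls.jp>
WHERE {{
  ?quanto a sos:SequenceStatisticsReport ;
    rdfs:seeAlso <http://identifiers.org/insdc.sra/{accession}> ;
    sos:totalSequences ?bk1 ;
    sos:percentGC ?bk2 .
  ?bk1 rdf:value ?total_seq .
  ?bk2 rdf:value ?gc_content .
}}
LIMIT 1
"""


def stats_query(accession: str) -> str:
    """Return the query text for one run accession, rejecting malformed input."""
    if RUN_ACCESSION.fullmatch(accession) is None:
        raise ValueError(f"not a run accession: {accession!r}")
    return QUERY_TEMPLATE.format(accession=accession)
```

Calling `stats_query("ERR026579")` yields a query equivalent in shape to Example 1, restricted to the total sequence count and GC content.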

Example 2

# Retrieve the 50 sequencing runs with the most sequences among those with median base call quality above 30 and median read length above 70

PREFIX sos: <http://purl.jp/bio/01/quanto/ontology/sos#>
PREFIX quanto: <http://purl.jp/bio/01/quanto/resource/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX pav: <http://purl.org/pav/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT ?id ?insdcId ?median_bc_quality ?total_sequences ?median_seq_length
FROM <http://quanto.dbcls.jp>
WHERE {
  ?quanto dct:identifier ?id ;
    rdfs:seeAlso ?insdcId ;
    sos:totalSequences ?x1 ;
    sos:overallMedianBaseCallQuality ?x2 ;
    sos:medianSequenceLength ?x3 .

  ?x1 rdf:value ?total_sequences .
  ?x2 rdf:value ?median_bc_quality .
  ?x3 rdf:value ?median_seq_length .

  FILTER(?median_bc_quality > 30).
  FILTER(?median_seq_length > 70).
}
ORDER BY DESC(?total_sequences)
LIMIT 50

Example 3

# Calculate the average number of sequences per run for long-read sequencing data (median sequence length above 500)

PREFIX sos: <http://purl.jp/bio/01/quanto/ontology/sos#>
PREFIX quanto: <http://purl.jp/bio/01/quanto/resource/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX pav: <http://purl.org/pav/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT (AVG(?total_sequences) AS ?average_throughput)
FROM <http://quanto.dbcls.jp>
WHERE {
  ?quanto dct:identifier ?id;
    sos:totalSequences ?x1;
    sos:overallMedianBaseCallQuality ?x2;
    sos:medianSequenceLength ?x3 .
  ?x1 rdf:value ?total_sequences .
  ?x2 rdf:value ?median_bc_quality .
  ?x3 rdf:value ?median_seq_length .
  FILTER(?median_seq_length > 500).
}
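
When results are requested in the SPARQL 1.1 JSON results format, each row arrives as a mapping from variable name to a small object holding the value. The sketch below flattens that structure into plain dictionaries. `bindings_to_rows` is a hypothetical helper, and the sample document uses a placeholder number, not an actual result from the Quanto dataset.

```python
import json


def bindings_to_rows(response_text: str) -> list[dict[str, str]]:
    """Flatten SPARQL JSON results into one {variable: value} dict per row."""
    document = json.loads(response_text)
    rows = []
    for binding in document["results"]["bindings"]:
        # Each cell looks like {"type": ..., "value": ...}; keep only the value.
        rows.append({name: cell["value"] for name, cell in binding.items()})
    return rows


# Placeholder response shaped like the output of Example 3 (value is made up).
SAMPLE = """
{
  "head": {"vars": ["average_throughput"]},
  "results": {"bindings": [
    {"average_throughput": {"type": "literal", "value": "123.5"}}
  ]}
}
"""

rows = bindings_to_rows(SAMPLE)
```

Values are kept as strings here; converting them to numbers is left to the caller, since the datatype of each literal depends on the query.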

Schema diagram

Schema diagram for quanto