IsoSeq Clustering (Refine + Cluster)
Clustering: All 2017 combined (4 SMRT cells, excl. K7 2019)
Type
CWL
Status
succeeded
Engine
cwltool
Duration
0.5 h
Source Data
Pipeline
PacBio CCS (Subreads → HiFi)
IsoSeq Clustering (Refine + Cluster)
Run #88
(this run)
succeeded
4 sources
IsoSeq Annotation (Map + Collapse + SQANTI3)
Functional Annotation (TransDecoder + Pfam + SwissProt)
Combined From
- #74 — PacBio CCS (Subreads → HiFi) succeeded
- #75 — PacBio CCS (Subreads → HiFi) succeeded
- #76 — PacBio CCS (Subreads → HiFi) succeeded
- #77 — PacBio CCS (Subreads → HiFi) succeeded
Workflow
IsoSeq Clustering (Refine + Cluster)
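Generic IsoSeq3 clustering pipeline. Merges demultiplexed BAMs, runs primer removal + polyA filtering (refine), then clusters into HQ/LQ isoform consensus sequences. Compatible with Sequel I CCS (use_qvs=false) and Sequel II/IIe HiFi data.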
Software Tools
| Tool | Version | URL |
|---|---|---|
| cwltool | - | https://github.com/common-workflow-language/cwltool |
Results Summary
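| Metric | Value | Note |
|---|---|---|
| Input CCS Reads | 526,505 | |
| FLNC Reads | 515,524 | |
| Mean FLNC Length | 0 nt | |
| HQ Isoforms | 0 | |
| LQ Isoforms | 0 | |
| Clustering Ratio | 0.0 | FLNC / HQ isoforms |
| Mean FL Support | 8.8 | reads per isoform |
| Total FL Reads | 395,845 | |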
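The figures in the Results Summary can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are made up for this example, and it assumes that mean FL support is defined as total FL reads divided by the number of isoforms, which the report does not state explicitly.

```python
# Values copied from the Results Summary of this run.
input_ccs_reads = 526_505
flnc_reads = 515_524
total_fl_reads = 395_845
mean_fl_support = 8.8  # reads per isoform, as reported

# Fraction of input CCS reads that were kept as FLNC reads after refine.
flnc_yield = flnc_reads / input_ccs_reads

# Fraction of FLNC reads that are counted as FL support for a clustered isoform.
clustered_fraction = total_fl_reads / flnc_reads

# Assumption: mean FL support = total FL reads / number of isoforms.
implied_isoforms = total_fl_reads / mean_fl_support

print(f"FLNC yield:             {flnc_yield:.1%}")
print(f"FLNC reads in clusters: {clustered_fraction:.1%}")
print(f"Implied isoform count:  {implied_isoforms:,.0f}")
```

Under that assumption the implied isoform count is roughly 45,000, which does not agree with the HQ Isoforms, LQ Isoforms and Clustering Ratio values of 0 reported above. Those zeros may therefore reflect fields the summary step could not fill in rather than true counts.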
Output Files
Provenance
| Field | Value |
|---|---|
| Execution | Expression quantification summary |
| Completed | 2026-03-04T23:33:04+00:00 |
RO-Crate 1.1
Workflow RO-Crate 1.0
FAIR
This analysis is packaged as a Research Object Crate
with machine-readable provenance and FAIR metadata.
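Because each file entity in this crate's metadata carries a `sha256` value, a downloaded copy can be checked against it. The following is a minimal sketch, not part of the pipeline: the function names `sha256_of` and `verify_crate` are hypothetical, and it assumes the metadata is stored as `ro-crate-metadata.json` at the top of the crate directory with file `@id` values given as paths relative to that directory.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large BAM files are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_crate(crate_dir: str) -> dict[str, bool]:
    """Map each file that is present and has a recorded sha256 to match/mismatch.

    Entities without a sha256 value, and files missing from crate_dir,
    are skipped rather than reported.
    """
    root = Path(crate_dir)
    metadata = json.loads((root / "ro-crate-metadata.json").read_text())
    results = {}
    for entity in metadata["@graph"]:
        expected = entity.get("sha256")
        path = root / entity["@id"]
        if expected and path.is_file():
            results[entity["@id"]] = sha256_of(path) == expected
    return results
```

Calling `verify_crate("path/to/crate")` returns a dictionary such as `{"job.yml": True}`; any `False` entry points to a file whose content differs from what the metadata recorded.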
RO-Crate Metadata (JSON-LD)
{
"@context": "https://w3id.org/ro/crate/1.1/context",
"@graph": [
{
"@id": "ro-crate-metadata.json",
"@type": "CreativeWork",
"about": {
"@id": "./"
},
"conformsTo": [
{
"@id": "https://w3id.org/ro/crate/1.1"
},
{
"@id": "https://w3id.org/workflowhub/workflow-ro-crate/1.0"
}
]
},
{
"@id": "./",
"@type": "Dataset",
"name": "IsoSeq Clustering (Refine + Cluster) \u2014 Run #88",
"description": "Generic IsoSeq3 clustering pipeline. Merges demultiplexed BAMs, runs primer removal + polyA filtering (refine), then clusters into HQ/LQ isoform consensus sequences. Compatible with Sequel I CCS (use_qvs=false) and Sequel II/IIe HiFi data.",
"datePublished": "2026-03-04",
"license": {
"@id": "https://creativecommons.org/licenses/by/4.0/"
},
"mainEntity": {
"@id": "isoseq_clustering.cwl"
},
"hasPart": [
{
"@id": "isoseq_clustering.cwl"
},
{
"@id": "job.yml"
},
{
"@id": "clustered.hq.bam"
},
{
"@id": "clustered.lq.bam"
},
{
"@id": "clustered.cluster_report.csv"
},
{
"@id": "clustered.cluster"
},
{
"@id": "flnc.filter_summary.report.json"
},
{
"@id": "flnc.bam"
},
{
"@id": "clustered.hq.bam.pbi"
},
{
"@id": "demux_primers.lima.summary"
},
{
"@id": "flnc.bam.pbi"
},
{
"@id": "clustered.lq.bam.pbi"
},
{
"@id": "results_summary.json"
},
{
"@id": "summary_extractor.py"
}
],
"mentions": [
{
"@id": "#execution"
},
{
"@id": "#summary-extraction"
}
]
},
{
"@id": "isoseq_clustering.cwl",
"@type": [
"File",
"SoftwareSourceCode",
"ComputationalWorkflow"
],
"name": "IsoSeq Clustering (Refine + Cluster)",
"description": "Generic IsoSeq3 clustering pipeline. Merges demultiplexed BAMs, runs primer removal + polyA filtering (refine), then clusters into HQ/LQ isoform consensus sequences. Compatible with Sequel I CCS (use_qvs=false) and Sequel II/IIe HiFi data.",
"programmingLanguage": {
"@id": "#cwl"
},
"contentSize": "2.9 KB",
"sha256": "3cd8cfcc8caaf0fb4a964a16fc02edffbddb048e0c46b8c633c8fd3abf7efa08"
},
{
"@id": "#cwl",
"@type": "ComputerLanguage",
"name": "Common Workflow Language",
"url": {
"@id": "https://www.commonwl.org/"
},
"version": "1.2"
},
{
"@id": "#cwltool",
"@type": "SoftwareApplication",
"name": "cwltool",
"url": {
"@id": "https://github.com/common-workflow-language/cwltool"
}
},
{
"@id": "job.yml",
"@type": "File",
"name": "job.yml",
"description": "CWL job input parameters",
"encodingFormat": "text/yaml",
"contentSize": "262 B",
"sha256": "6de1d68f830c1f38a9f742b94846d277e58f4a4a36bf8753d5acbc16ad03e66a"
},
{
"@id": "clustered.hq.bam",
"@type": "File",
"name": "clustered.hq.bam",
"encodingFormat": "application/octet-stream",
"contentSize": "40.4 MB",
"sha256": "dbf2566d09cca8881a998c975d206690f2ec9662ebd0a63e84eb1d7c54e12862"
},
{
"@id": "clustered.lq.bam",
"@type": "File",
"name": "clustered.lq.bam",
"encodingFormat": "application/octet-stream",
"contentSize": "11.1 KB",
"sha256": "5d3d19f1f72709ae449dcfc2f4732534bc62f03f4de8b01c7f0b4f76392827e3"
},
{
"@id": "clustered.cluster_report.csv",
"@type": "File",
"name": "clustered.cluster_report.csv",
"encodingFormat": "text/csv",
"contentSize": "20.3 MB",
"sha256": "fa44cc69ba7593b97afbe83cb513c583f47e48db0c8085deb3c104ac25dc678d"
},
{
"@id": "clustered.cluster",
"@type": "File",
"name": "clustered.cluster",
"encodingFormat": "application/octet-stream",
"contentSize": "25.6 MB",
"sha256": "0d7e3b5f6445bfb202b516360f1772de34d6e26e9cf4488414934b2d281116b7"
},
{
"@id": "flnc.filter_summary.report.json",
"@type": "File",
"name": "flnc.filter_summary.report.json",
"encodingFormat": "application/json",
"contentSize": "819 B",
"sha256": "71fd48b8f5b7fd289bc86da83f55526861077dfa11abb2cb12baf5e658a78d2a"
},
{
"@id": "flnc.bam",
"@type": "File",
"name": "flnc.bam",
"encodingFormat": "application/octet-stream",
"contentSize": "1.17 GB",
"sha256": "39b5a12f4687162096070fad638d451eeb76b18472cffcacbe68f284539a1aeb"
},
{
"@id": "clustered.hq.bam.pbi",
"@type": "File",
"name": "clustered.hq.bam.pbi",
"encodingFormat": "application/octet-stream",
"contentSize": "251.6 KB",
"sha256": "d7c3107cc9dbbe13a0936ca382949fbb856aa02c7c7f75c71733162233f66d0e"
},
{
"@id": "demux_primers.lima.summary",
"@type": "File",
"name": "demux_primers.lima.summary",
"encodingFormat": "application/octet-stream",
"contentSize": "919 B",
"sha256": "d2dfbfe0254f57f152c21a8db55a091a09da96b1f28b71387a93f7276c3cf667"
},
{
"@id": "flnc.bam.pbi",
"@type": "File",
"name": "flnc.bam.pbi",
"encodingFormat": "application/octet-stream",
"contentSize": "5.3 MB",
"sha256": "807b4ce3b1b02153ec29c37417046df72491f9879116b6c99010fa60a606b409"
},
{
"@id": "clustered.lq.bam.pbi",
"@type": "File",
"name": "clustered.lq.bam.pbi",
"encodingFormat": "application/octet-stream",
"contentSize": "136 B",
"sha256": "e6c251563005ab94b67529c78aa08624be7208ff3229218e8040d1f88395a431"
},
{
"@id": "#execution",
"@type": "CreateAction",
"name": "IsoSeq Clustering (Refine + Cluster) execution",
"instrument": {
"@id": "isoseq_clustering.cwl"
},
"startTime": "2026-03-05T09:01:21+00:00",
"endTime": "2026-03-04T23:32:49+00:00",
"object": [
{
"@id": "job.yml"
}
],
"result": [
{
"@id": "clustered.hq.bam"
},
{
"@id": "clustered.lq.bam"
},
{
"@id": "clustered.cluster_report.csv"
},
{
"@id": "clustered.cluster"
},
{
"@id": "flnc.filter_summary.report.json"
},
{
"@id": "flnc.bam"
},
{
"@id": "clustered.hq.bam.pbi"
},
{
"@id": "demux_primers.lima.summary"
},
{
"@id": "flnc.bam.pbi"
},
{
"@id": "clustered.lq.bam.pbi"
}
]
},
{
"@id": "results_summary.json",
"@type": "File",
"name": "results_summary.json",
"description": "Derived summary statistics from pipeline outputs (CPM >= 1, uniquely mapped reads)",
"encodingFormat": "application/json",
"contentSize": "277 B",
"sha256": "25712d9396be9d68b626c271171cec06c057263311425b27ffbde6fe05653111"
},
{
"@id": "summary_extractor.py",
"@type": [
"File",
"SoftwareSourceCode"
],
"name": "Summary extraction script",
"description": "Python script that computed results_summary.json from pipeline outputs",
"programmingLanguage": {
"@id": "#python3"
}
},
{
"@id": "#python3",
"@type": "ComputerLanguage",
"name": "Python",
"url": {
"@id": "https://www.python.org/"
},
"version": "3"
},
{
"@id": "#summary-extraction",
"@type": "CreateAction",
"name": "Expression quantification summary",
"instrument": {
"@id": "summary_extractor.py"
},
"endTime": "2026-03-04T23:33:04+00:00",
"object": [
{
"@id": "OUT.read_assignments.tsv.gz"
},
{
"@id": "OUT.gene_counts.tsv"
},
{
"@id": "OUT.transcript_counts.tsv"
},
{
"@id": "OUT.extended_annotation.gtf"
},
{
"@id": "OUT.transcript_models.gtf"
}
],
"result": [
{
"@id": "results_summary.json"
}
]
}
]
}
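Two properties of the JSON-LD above can be checked mechanically: whether every `@id` referenced from `hasPart`, `object`, `result` or `mentions` is also defined as an entity in `@graph`, and whether each action's `startTime` precedes its `endTime`. The sketch below is a hypothetical helper (`find_issues` is not part of any RO-Crate library) and assumes those four properties are always lists of `{"@id": ...}` objects, as they are in this crate. The `excerpt` reproduces only a few entities from the metadata above.

```python
from datetime import datetime

REFERENCE_KEYS = ("hasPart", "object", "result", "mentions")


def find_issues(metadata: dict) -> list[str]:
    """Report references to undefined entities and reversed time ranges."""
    graph = metadata["@graph"]
    known_ids = {entity["@id"] for entity in graph}
    issues = []
    for entity in graph:
        for key in REFERENCE_KEYS:
            for ref in entity.get(key, []):
                if ref["@id"] not in known_ids:
                    issues.append(
                        f"{entity['@id']}: {key} '{ref['@id']}' has no entity"
                    )
        start, end = entity.get("startTime"), entity.get("endTime")
        if start and end and datetime.fromisoformat(start) > datetime.fromisoformat(end):
            issues.append(f"{entity['@id']}: startTime is later than endTime")
    return issues


# Excerpt of the crate above, trimmed to a handful of entities.
excerpt = {
    "@graph": [
        {"@id": "job.yml"},
        {"@id": "clustered.hq.bam"},
        {"@id": "results_summary.json"},
        {
            "@id": "#execution",
            "startTime": "2026-03-05T09:01:21+00:00",
            "endTime": "2026-03-04T23:32:49+00:00",
            "object": [{"@id": "job.yml"}],
            "result": [{"@id": "clustered.hq.bam"}],
        },
        {
            "@id": "#summary-extraction",
            "endTime": "2026-03-04T23:33:04+00:00",
            "object": [{"@id": "OUT.gene_counts.tsv"}],
            "result": [{"@id": "results_summary.json"}],
        },
    ]
}

for issue in find_issues(excerpt):
    print(issue)
```

On this excerpt the check flags the `#execution` action, whose recorded `startTime` falls after its `endTime`, and the `OUT.gene_counts.tsv` input of `#summary-extraction`, which is referenced but never described as an entity in the crate.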