IsoSeq Clustering (Refine + Cluster)

Clustering: DBA K7 (individual)

Type

CWL

Status

succeeded

Engine

cwltool

Duration

0.2 h

Source Data

Study

Strain-specific cortex gene expression and isoform usage

Pipeline

PacBio CCS (Subreads → HiFi)

Run #78

succeeded

IsoSeq Clustering (Refine + Cluster)

Run #83 (this run)

succeeded (1 source)

IsoSeq Annotation (Map + Collapse + SQANTI3)

Run #93

succeeded (1 source)

Functional Annotation (TransDecoder + Pfam + SwissProt)

Run #103

succeeded (1 source)

Combined From

#78 — PacBio CCS (Subreads → HiFi) succeeded

Workflow

IsoSeq Clustering (Refine + Cluster)

#cwl

Software Tools

Tool	Version	URL
cwltool	-	https://github.com/common-workflow-language/cwltool

Results Summary

Input CCS Reads

189,070

FLNC Reads

183,560

Mean FLNC Length

0 nt

HQ Isoforms

LQ Isoforms

Clustering Ratio

0.0

FLNC / HQ isoforms

Mean FL Support

7.5

reads per isoform

Total FL Reads

128,328
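
The ratios in this summary can be recomputed from the raw counts. The Python sketch below assumes the definitions implied by the labels above (clustering ratio = FLNC reads / HQ isoforms). The function names are illustrative and are not part of the pipeline. Returning None for a missing isoform count is a design assumption, chosen so that an absent value is not reported as 0.0.

```python
from typing import Optional


def flnc_yield(ccs_reads: int, flnc_reads: int) -> float:
    """Fraction of input CCS reads retained as FLNC reads after refine."""
    if ccs_reads <= 0:
        raise ValueError("ccs_reads must be positive")
    return flnc_reads / ccs_reads


def clustering_ratio(flnc_reads: int, hq_isoforms: Optional[int]) -> Optional[float]:
    """FLNC reads per HQ isoform; None when the isoform count is missing or zero."""
    if not hq_isoforms:
        return None
    return flnc_reads / hq_isoforms


# Counts reported for this run: 189,070 input CCS reads, 183,560 FLNC reads.
print(f"FLNC yield: {flnc_yield(189_070, 183_560):.1%}")  # FLNC yield: 97.1%

# The HQ isoform count is not shown on this page, so no ratio is computed.
print(clustering_ratio(183_560, None))  # None
```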

Output Files

File	Storage	Size
clustered.cluster	HPC	8.3 MB
clustered.cluster_report.csv	HPC	6.5 MB
clustered.hq.bam	HPC	12.3 MB
clustered.hq.bam.pbi	HPC	95.7 KB
clustered.lq.bam	HPC	623 B
clustered.lq.bam.pbi	HPC	65 B
demux_primers.lima.summary	HPC	874 B
flnc.bam	HPC	274.8 MB
flnc.bam.pbi	HPC	1.9 MB
flnc.filter_summary.report.json	HPC	819 B
job.yml	HPC	273 B
results_summary.json	HPC	277 B

Provenance

Execution	Expression quantification summary
Completed	2026-03-04T12:47:22+00:00

RO-Crate 1.1 | Workflow RO-Crate 1.0 | FAIR

This analysis is packaged as a Research Object Crate with machine-readable provenance and FAIR metadata.
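
As a rough illustration of what "machine-readable" means here, the Python sketch below indexes an RO-Crate `@graph` by `@id` and lists the parts of the root dataset. The helper names (`index_graph`, `list_parts`) and the small inline example are assumptions made for illustration. Only the key names (`@graph`, `@id`, `hasPart`, `contentSize`) are taken from the crate metadata.

```python
def index_graph(crate: dict) -> dict:
    """Map each entity's @id to the entity itself for direct lookup."""
    return {entity["@id"]: entity for entity in crate["@graph"]}


def list_parts(entities: dict) -> list:
    """Return (id, contentSize) pairs for every item in the root dataset's hasPart."""
    root = entities["./"]
    parts = []
    for ref in root.get("hasPart", []):
        # A referenced id may have no entity of its own, or no recorded size.
        entity = entities.get(ref["@id"], {})
        parts.append((ref["@id"], entity.get("contentSize")))
    return parts


# Tiny hand-written example with the same shape as the crate metadata.
example = {
    "@graph": [
        {"@id": "./", "hasPart": [{"@id": "job.yml"}, {"@id": "summary_extractor.py"}]},
        {"@id": "job.yml", "contentSize": "273 B"},
        {"@id": "summary_extractor.py"},
    ]
}

for file_id, size in list_parts(index_graph(example)):
    print(file_id, size or "size not recorded")
```

For a real crate, the same functions can be applied to the dictionary returned by `json.load` on `ro-crate-metadata.json`.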

RO-Crate Metadata (JSON-LD)

{
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "about": {
                "@id": "./"
            },
            "conformsTo": [
                {
                    "@id": "https://w3id.org/ro/crate/1.1"
                },
                {
                    "@id": "https://w3id.org/workflowhub/workflow-ro-crate/1.0"
                }
            ]
        },
        {
            "@id": "./",
            "@type": "Dataset",
            "name": "IsoSeq Clustering (Refine + Cluster) \u2014 Run #83",
            "description": "Generic IsoSeq3 clustering pipeline. Merges demultiplexed BAMs, runs primer removal + polyA filtering (refine), then clusters into HQ/LQ isoform consensus sequences. Compatible with Sequel I CCS (use_qvs=false) and Sequel II/IIe HiFi data.",
            "datePublished": "2026-03-04",
            "license": {
                "@id": "https://creativecommons.org/licenses/by/4.0/"
            },
            "mainEntity": {
                "@id": "isoseq_clustering.cwl"
            },
            "hasPart": [
                {
                    "@id": "isoseq_clustering.cwl"
                },
                {
                    "@id": "job.yml"
                },
                {
                    "@id": "clustered.hq.bam"
                },
                {
                    "@id": "clustered.lq.bam"
                },
                {
                    "@id": "clustered.cluster_report.csv"
                },
                {
                    "@id": "clustered.cluster"
                },
                {
                    "@id": "flnc.filter_summary.report.json"
                },
                {
                    "@id": "flnc.bam"
                },
                {
                    "@id": "clustered.hq.bam.pbi"
                },
                {
                    "@id": "demux_primers.lima.summary"
                },
                {
                    "@id": "flnc.bam.pbi"
                },
                {
                    "@id": "clustered.lq.bam.pbi"
                },
                {
                    "@id": "results_summary.json"
                },
                {
                    "@id": "summary_extractor.py"
                }
            ],
            "mentions": [
                {
                    "@id": "#execution"
                },
                {
                    "@id": "#summary-extraction"
                }
            ]
        },
        {
            "@id": "isoseq_clustering.cwl",
            "@type": [
                "File",
                "SoftwareSourceCode",
                "ComputationalWorkflow"
            ],
            "name": "IsoSeq Clustering (Refine + Cluster)",
            "description": "Generic IsoSeq3 clustering pipeline. Merges demultiplexed BAMs, runs primer removal + polyA filtering (refine), then clusters into HQ/LQ isoform consensus sequences. Compatible with Sequel I CCS (use_qvs=false) and Sequel II/IIe HiFi data.",
            "programmingLanguage": {
                "@id": "#cwl"
            },
            "contentSize": "2.9 KB",
            "sha256": "3cd8cfcc8caaf0fb4a964a16fc02edffbddb048e0c46b8c633c8fd3abf7efa08"
        },
        {
            "@id": "#cwl",
            "@type": "ComputerLanguage",
            "name": "Common Workflow Language",
            "url": {
                "@id": "https://www.commonwl.org/"
            },
            "version": "1.2"
        },
        {
            "@id": "#cwltool",
            "@type": "SoftwareApplication",
            "name": "cwltool",
            "url": {
                "@id": "https://github.com/common-workflow-language/cwltool"
            }
        },
        {
            "@id": "job.yml",
            "@type": "File",
            "name": "job.yml",
            "description": "CWL job input parameters",
            "encodingFormat": "text/yaml",
            "contentSize": "273 B",
            "sha256": "c7e9f1eb0fe8fadc60cc7da5cddae44fbc6811a8bf8dc0337009928d6c8be10a"
        },
        {
            "@id": "clustered.hq.bam",
            "@type": "File",
            "name": "clustered.hq.bam",
            "encodingFormat": "application/octet-stream",
            "contentSize": "12.3 MB",
            "sha256": "f6d013acb9d60d5db6727b69be477535ddcd45abc1858ccafa2e90e3d93f07a2"
        },
        {
            "@id": "clustered.lq.bam",
            "@type": "File",
            "name": "clustered.lq.bam",
            "encodingFormat": "application/octet-stream",
            "contentSize": "623 B",
            "sha256": "46f6704324365c809bd5fbc7da588e7d646d9176c4ffed261163a30ed7baeba2"
        },
        {
            "@id": "clustered.cluster_report.csv",
            "@type": "File",
            "name": "clustered.cluster_report.csv",
            "encodingFormat": "text/csv",
            "contentSize": "6.5 MB",
            "sha256": "82f0c19aab62191f2e1b14dd08212aee2ca99aeaa6597d42863b6281bf147cad"
        },
        {
            "@id": "clustered.cluster",
            "@type": "File",
            "name": "clustered.cluster",
            "encodingFormat": "application/octet-stream",
            "contentSize": "8.3 MB",
            "sha256": "5ad40a0e9e2ba6366c98ed643270b8fc0b9b6ef67a38a6f36dec85eb66c62988"
        },
        {
            "@id": "flnc.filter_summary.report.json",
            "@type": "File",
            "name": "flnc.filter_summary.report.json",
            "encodingFormat": "application/json",
            "contentSize": "819 B",
            "sha256": "2271c40ea963b82a60466ca40765ada8debbeec83789eeddd9ff56f3232394fd"
        },
        {
            "@id": "flnc.bam",
            "@type": "File",
            "name": "flnc.bam",
            "encodingFormat": "application/octet-stream",
            "contentSize": "274.8 MB",
            "sha256": "d7615c8f2839fc5eb0ab6542075c8d6037003ffa9a1f4af3e0f08d1f65a0f719"
        },
        {
            "@id": "clustered.hq.bam.pbi",
            "@type": "File",
            "name": "clustered.hq.bam.pbi",
            "encodingFormat": "application/octet-stream",
            "contentSize": "95.7 KB",
            "sha256": "c18ed7449f9e3816ee56650fb5af5c2cb1e90a454835dbd4b3ac3be4466993f6"
        },
        {
            "@id": "demux_primers.lima.summary",
            "@type": "File",
            "name": "demux_primers.lima.summary",
            "encodingFormat": "application/octet-stream",
            "contentSize": "874 B",
            "sha256": "491df6e09a295c032e77705c236ea7e1e4475cc30f6c82c19f953fb9d4ef35e1"
        },
        {
            "@id": "flnc.bam.pbi",
            "@type": "File",
            "name": "flnc.bam.pbi",
            "encodingFormat": "application/octet-stream",
            "contentSize": "1.9 MB",
            "sha256": "01bb9ef535fa3fcd428d0c1916d7ced2f5d64cf11e02657170d89c2635ebef6f"
        },
        {
            "@id": "clustered.lq.bam.pbi",
            "@type": "File",
            "name": "clustered.lq.bam.pbi",
            "encodingFormat": "application/octet-stream",
            "contentSize": "65 B",
            "sha256": "0898b440b5d691c4384b9cadaf7b007cfb06b947826234896416987b491df3ad"
        },
        {
            "@id": "#execution",
            "@type": "CreateAction",
            "name": "IsoSeq Clustering (Refine + Cluster) execution",
            "instrument": {
                "@id": "isoseq_clustering.cwl"
            },
            "startTime": "2026-03-04T12:34:13+00:00",
            "endTime": "2026-03-04T12:47:06+00:00",
            "object": [
                {
                    "@id": "job.yml"
                }
            ],
            "result": [
                {
                    "@id": "clustered.hq.bam"
                },
                {
                    "@id": "clustered.lq.bam"
                },
                {
                    "@id": "clustered.cluster_report.csv"
                },
                {
                    "@id": "clustered.cluster"
                },
                {
                    "@id": "flnc.filter_summary.report.json"
                },
                {
                    "@id": "flnc.bam"
                },
                {
                    "@id": "clustered.hq.bam.pbi"
                },
                {
                    "@id": "demux_primers.lima.summary"
                },
                {
                    "@id": "flnc.bam.pbi"
                },
                {
                    "@id": "clustered.lq.bam.pbi"
                }
            ]
        },
        {
            "@id": "results_summary.json",
            "@type": "File",
            "name": "results_summary.json",
            "description": "Derived summary statistics from pipeline outputs (CPM >= 1, uniquely mapped reads)",
            "encodingFormat": "application/json",
            "contentSize": "277 B",
            "sha256": "d069db8b4e9d16726b69125cf1d83049296e972e8f434d704a581be2f49fc36e"
        },
        {
            "@id": "summary_extractor.py",
            "@type": [
                "File",
                "SoftwareSourceCode"
            ],
            "name": "Summary extraction script",
            "description": "Python script that computed results_summary.json from pipeline outputs",
            "programmingLanguage": {
                "@id": "#python3"
            }
        },
        {
            "@id": "#python3",
            "@type": "ComputerLanguage",
            "name": "Python",
            "url": {
                "@id": "https://www.python.org/"
            },
            "version": "3"
        },
        {
            "@id": "#summary-extraction",
            "@type": "CreateAction",
            "name": "Expression quantification summary",
            "instrument": {
                "@id": "summary_extractor.py"
            },
            "endTime": "2026-03-04T12:47:22+00:00",
            "object": [
                {
                    "@id": "OUT.read_assignments.tsv.gz"
                },
                {
                    "@id": "OUT.gene_counts.tsv"
                },
                {
                    "@id": "OUT.transcript_counts.tsv"
                },
                {
                    "@id": "OUT.extended_annotation.gtf"
                },
                {
                    "@id": "OUT.transcript_models.gtf"
                }
            ],
            "result": [
                {
                    "@id": "results_summary.json"
                }
            ]
        }
    ]
}
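
A consumer of this metadata may want to check it against the packaged files before relying on it. The Python sketch below shows two such checks: comparing each recorded `sha256` with the file on disk, and flagging any action whose `startTime` is later than its `endTime`. The function names and the `crate_dir` parameter are hypothetical. The sketch assumes the `sha256` values are lowercase hex digests of the full file contents, as they appear to be above.

```python
import hashlib
from datetime import datetime
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hex SHA-256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_mismatches(entities: list, crate_dir: Path) -> list:
    """Ids of entities whose recorded sha256 differs from the file on disk."""
    bad = []
    for entity in entities:
        expected = entity.get("sha256")
        path = crate_dir / entity["@id"]
        # Entities without a checksum, or whose file is absent, are skipped.
        if expected and path.is_file() and sha256_of(path) != expected:
            bad.append(entity["@id"])
    return bad


def reversed_actions(entities: list) -> list:
    """Ids of entities whose startTime is later than their endTime."""
    bad = []
    for entity in entities:
        start, end = entity.get("startTime"), entity.get("endTime")
        if start and end and datetime.fromisoformat(start) > datetime.fromisoformat(end):
            bad.append(entity["@id"])
    return bad
```

Both functions take the `@graph` list directly, so they can be called as `checksum_mismatches(crate["@graph"], Path("."))` once the metadata file has been loaded with `json.load`.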