IsoSeq Clustering (Refine + Cluster)

Clustering: C57 K1-50pM (individual)

Type	CWL
Status	succeeded
Engine	cwltool
Duration	0.2 h

Source Data

Study

Strain-specific cortex gene expression and isoform usage

Pipeline

Step	Run	Status
PacBio CCS (Subreads → HiFi)	#76	succeeded
IsoSeq Clustering (Refine + Cluster)	#81 (this run)	succeeded, 1 source
IsoSeq Annotation (Map + Collapse + SQANTI3)	#91	succeeded, 1 source
Functional Annotation (TransDecoder + Pfam + SwissProt)	#101	succeeded, 1 source

Combined From

#76 — PacBio CCS (Subreads → HiFi) succeeded

Workflow

IsoSeq Clustering (Refine + Cluster)

#cwl

Software Tools

Tool	Version	URL
cwltool	-	https://github.com/common-workflow-language/cwltool

Results Summary

Metric	Value	Note
Input CCS Reads	139,403	
FLNC Reads	136,489	
Mean FLNC Length	0 nt	
HQ Isoforms	-	
LQ Isoforms	-	
Clustering Ratio	0.0	FLNC / HQ isoforms
Mean FL Support	6.6	reads per isoform
Total FL Reads	91,987	
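
The summary figures above can in principle be recomputed from the pipeline outputs. The following is a minimal sketch, assuming clustered.cluster_report.csv has a header row with cluster_id and read_id columns (the usual isoseq3 layout, not confirmed by this page). The function names are hypothetical and the sample rows are invented for illustration; whether the reported "Mean FL Support" counts all clusters or only HQ isoforms is also an assumption.

```python
import csv
import io
from collections import Counter


def fl_support(report_csv: str) -> dict:
    """Summarise full-length (FL) read support per cluster.

    Assumes one row per read, with `cluster_id` and `read_id` columns.
    """
    reads_per_cluster = Counter()
    for row in csv.DictReader(io.StringIO(report_csv)):
        reads_per_cluster[row["cluster_id"]] += 1
    total = sum(reads_per_cluster.values())
    clusters = len(reads_per_cluster)
    return {
        "clusters": clusters,
        "total_fl_reads": total,
        # Guard against an empty report instead of dividing by zero.
        "mean_fl_support": total / clusters if clusters else 0.0,
    }


def flnc_retention(ccs_reads: int, flnc_reads: int) -> float:
    """Fraction of input CCS reads that survive refine as FLNC reads."""
    return flnc_reads / ccs_reads if ccs_reads else 0.0


# Illustrative rows only, not taken from this run.
SAMPLE = """cluster_id,read_id,read_type
transcript/0,m0/1/ccs,FL
transcript/0,m0/2/ccs,FL
transcript/1,m0/3/ccs,FL
"""

if __name__ == "__main__":
    print(fl_support(SAMPLE))
    # Read counts as reported in the Results Summary.
    print(round(flnc_retention(139_403, 136_489), 4))
```

With the figures reported above, 91,987 FL reads at a mean support of 6.6 would correspond to roughly 14,000 isoforms, so the blank HQ count and the 0.0 clustering ratio may reflect values the summary step did not capture rather than a genuine result.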

Output Files

File	Storage	Size
clustered.cluster	HPC	6 MB
clustered.cluster_report.csv	HPC	4.7 MB
clustered.hq.bam	HPC	11.7 MB
clustered.hq.bam.pbi	HPC	80.2 KB
clustered.lq.bam	HPC	3.2 KB
clustered.lq.bam.pbi	HPC	101 B
demux_primers.lima.summary	HPC	917 B
flnc.bam	HPC	320 MB
flnc.bam.pbi	HPC	1.4 MB
flnc.filter_summary.report.json	HPC	819 B
job.yml	HPC	278 B
results_summary.json	HPC	276 B

Provenance

Execution	Expression quantification summary
Completed	2026-03-04T12:44:10+00:00
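
Provenance timestamps of this kind are ISO 8601 strings, so their ordering and the reported duration can be checked mechanically. A minimal sketch follows; the function name is hypothetical and the timestamps in the example are illustrative, not taken from this run.

```python
from datetime import datetime


def action_duration_hours(start: str, end: str) -> float:
    """Return the elapsed time in hours between two ISO 8601 timestamps.

    Raises ValueError when the end precedes the start, which usually
    points at a recording error in the provenance metadata.
    """
    started = datetime.fromisoformat(start)
    ended = datetime.fromisoformat(end)
    if ended < started:
        raise ValueError(f"endTime {end} precedes startTime {start}")
    return (ended - started).total_seconds() / 3600


if __name__ == "__main__":
    # Illustrative timestamps: a 12-minute action.
    hours = action_duration_hours(
        "2026-03-04T12:00:00+00:00",
        "2026-03-04T12:12:00+00:00",
    )
    print(f"{hours:.1f} h")
```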

RO-Crate 1.1, Workflow RO-Crate 1.0, FAIR

This analysis is packaged as a Research Object Crate with machine-readable provenance and FAIR metadata.
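
Because the provenance is machine-readable, its internal consistency can be checked with a few lines of code. The sketch below looks for local `@id` references that no entity in `@graph` defines. The helper names are hypothetical, external `http(s)` identifiers are deliberately skipped, and the sample crate is invented for illustration.

```python
import json


def iter_refs(node):
    """Yield every {"@id": ...} reference nested inside a property value."""
    if isinstance(node, dict):
        if set(node) == {"@id"}:
            yield node["@id"]
        else:
            for value in node.values():
                yield from iter_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_refs(item)


def dangling_refs(crate: dict) -> set:
    """Return local @id references that no entity in @graph defines."""
    defined = {entity["@id"] for entity in crate["@graph"]}
    missing = set()
    for entity in crate["@graph"]:
        for key, value in entity.items():
            if key == "@id":
                continue  # the entity's own identifier is not a reference
            for ref in iter_refs(value):
                is_external = ref.startswith(("http://", "https://"))
                if not is_external and ref not in defined:
                    missing.add(ref)
    return missing


# Invented example: b.csv is referenced but never described.
SAMPLE = json.loads("""
{"@graph": [
  {"@id": "./", "@type": "Dataset",
   "hasPart": [{"@id": "a.bam"}, {"@id": "b.csv"}]},
  {"@id": "a.bam", "@type": "File"}
]}
""")

if __name__ == "__main__":
    print(sorted(dangling_refs(SAMPLE)))
```

Run against the metadata on this page, a check like this would be expected to flag the `OUT.*` inputs of `#summary-extraction`, which have no entity of their own in `@graph`.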

RO-Crate Metadata (JSON-LD)

{
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "about": {
                "@id": "./"
            },
            "conformsTo": [
                {
                    "@id": "https://w3id.org/ro/crate/1.1"
                },
                {
                    "@id": "https://w3id.org/workflowhub/workflow-ro-crate/1.0"
                }
            ]
        },
        {
            "@id": "./",
            "@type": "Dataset",
            "name": "IsoSeq Clustering (Refine + Cluster) \u2014 Run #81",
            "description": "Generic IsoSeq3 clustering pipeline. Merges demultiplexed BAMs, runs primer removal + polyA filtering (refine), then clusters into HQ/LQ isoform consensus sequences. Compatible with Sequel I CCS (use_qvs=false) and Sequel II/IIe HiFi data.",
            "datePublished": "2026-03-04",
            "license": {
                "@id": "https://creativecommons.org/licenses/by/4.0/"
            },
            "mainEntity": {
                "@id": "isoseq_clustering.cwl"
            },
            "hasPart": [
                {
                    "@id": "isoseq_clustering.cwl"
                },
                {
                    "@id": "job.yml"
                },
                {
                    "@id": "clustered.hq.bam"
                },
                {
                    "@id": "clustered.lq.bam"
                },
                {
                    "@id": "clustered.cluster_report.csv"
                },
                {
                    "@id": "clustered.cluster"
                },
                {
                    "@id": "flnc.filter_summary.report.json"
                },
                {
                    "@id": "flnc.bam"
                },
                {
                    "@id": "clustered.hq.bam.pbi"
                },
                {
                    "@id": "demux_primers.lima.summary"
                },
                {
                    "@id": "flnc.bam.pbi"
                },
                {
                    "@id": "clustered.lq.bam.pbi"
                },
                {
                    "@id": "results_summary.json"
                },
                {
                    "@id": "summary_extractor.py"
                }
            ],
            "mentions": [
                {
                    "@id": "#execution"
                },
                {
                    "@id": "#summary-extraction"
                }
            ]
        },
        {
            "@id": "isoseq_clustering.cwl",
            "@type": [
                "File",
                "SoftwareSourceCode",
                "ComputationalWorkflow"
            ],
            "name": "IsoSeq Clustering (Refine + Cluster)",
            "description": "Generic IsoSeq3 clustering pipeline. Merges demultiplexed BAMs, runs primer removal + polyA filtering (refine), then clusters into HQ/LQ isoform consensus sequences. Compatible with Sequel I CCS (use_qvs=false) and Sequel II/IIe HiFi data.",
            "programmingLanguage": {
                "@id": "#cwl"
            },
            "contentSize": "2.9 KB",
            "sha256": "3cd8cfcc8caaf0fb4a964a16fc02edffbddb048e0c46b8c633c8fd3abf7efa08"
        },
        {
            "@id": "#cwl",
            "@type": "ComputerLanguage",
            "name": "Common Workflow Language",
            "url": {
                "@id": "https://www.commonwl.org/"
            },
            "version": "1.2"
        },
        {
            "@id": "#cwltool",
            "@type": "SoftwareApplication",
            "name": "cwltool",
            "url": {
                "@id": "https://github.com/common-workflow-language/cwltool"
            }
        },
        {
            "@id": "job.yml",
            "@type": "File",
            "name": "job.yml",
            "description": "CWL job input parameters",
            "encodingFormat": "text/yaml",
            "contentSize": "278 B",
            "sha256": "3540d952c99745823a5ca9c99afa8bfed9ab0dd0cd11c4645b98d18e08a6eee1"
        },
        {
            "@id": "clustered.hq.bam",
            "@type": "File",
            "name": "clustered.hq.bam",
            "encodingFormat": "application/octet-stream",
            "contentSize": "11.7 MB",
            "sha256": "61ab93f801ae27449487614a30e44bbad9069354bc8f9785fb8121967b516e4b"
        },
        {
            "@id": "clustered.lq.bam",
            "@type": "File",
            "name": "clustered.lq.bam",
            "encodingFormat": "application/octet-stream",
            "contentSize": "3.2 KB",
            "sha256": "19532569d412792f29a313e8b637be479137820600776bbddb4a0d11ba155530"
        },
        {
            "@id": "clustered.cluster_report.csv",
            "@type": "File",
            "name": "clustered.cluster_report.csv",
            "encodingFormat": "text/csv",
            "contentSize": "4.7 MB",
            "sha256": "9396d55f689dd51f470905455f533c355c7d9da8c54a1931eeebaeecf976603f"
        },
        {
            "@id": "clustered.cluster",
            "@type": "File",
            "name": "clustered.cluster",
            "encodingFormat": "application/octet-stream",
            "contentSize": "6 MB",
            "sha256": "2be2e179192336f0ec8181acbdeecbe5dec0752ee04b6a5ae53bfe2bd30aa5b0"
        },
        {
            "@id": "flnc.filter_summary.report.json",
            "@type": "File",
            "name": "flnc.filter_summary.report.json",
            "encodingFormat": "application/json",
            "contentSize": "819 B",
            "sha256": "359fa32673fcaf99546875f69c63953afcf553e0b797d5f261ab75b6e2f81f1c"
        },
        {
            "@id": "flnc.bam",
            "@type": "File",
            "name": "flnc.bam",
            "encodingFormat": "application/octet-stream",
            "contentSize": "320 MB",
            "sha256": "68e433638cc341627de25b20b2d0ed31abb80c6a71f2a9841b253ef39295a73c"
        },
        {
            "@id": "clustered.hq.bam.pbi",
            "@type": "File",
            "name": "clustered.hq.bam.pbi",
            "encodingFormat": "application/octet-stream",
            "contentSize": "80.2 KB",
            "sha256": "9827a82c50f9bc4720e170ff15a0857123a829d06a9f276fa6e87ce147bdf7e2"
        },
        {
            "@id": "demux_primers.lima.summary",
            "@type": "File",
            "name": "demux_primers.lima.summary",
            "encodingFormat": "application/octet-stream",
            "contentSize": "917 B",
            "sha256": "9dd0fdadf6a72d9ccc6f3f752c314dbc7d29f40ba77ae04c10f2bb17e265990f"
        },
        {
            "@id": "flnc.bam.pbi",
            "@type": "File",
            "name": "flnc.bam.pbi",
            "encodingFormat": "application/octet-stream",
            "contentSize": "1.4 MB",
            "sha256": "f94cd3354d2aa668ea0ee0fa48a626b3777b8403f8ccec6bd53565cb3401e18d"
        },
        {
            "@id": "clustered.lq.bam.pbi",
            "@type": "File",
            "name": "clustered.lq.bam.pbi",
            "encodingFormat": "application/octet-stream",
            "contentSize": "101 B",
            "sha256": "e22c627e5eee1b83c2e989699a67c78a7b1ffc3331c184001ba2cfcc0b3c8540"
        },
        {
            "@id": "#execution",
            "@type": "CreateAction",
            "name": "IsoSeq Clustering (Refine + Cluster) execution",
            "instrument": {
                "@id": "isoseq_clustering.cwl"
            },
            "startTime": "2026-03-04T12:32:17+00:00",
            "endTime": "2026-03-04T12:43:56+00:00",
            "object": [
                {
                    "@id": "job.yml"
                }
            ],
            "result": [
                {
                    "@id": "clustered.hq.bam"
                },
                {
                    "@id": "clustered.lq.bam"
                },
                {
                    "@id": "clustered.cluster_report.csv"
                },
                {
                    "@id": "clustered.cluster"
                },
                {
                    "@id": "flnc.filter_summary.report.json"
                },
                {
                    "@id": "flnc.bam"
                },
                {
                    "@id": "clustered.hq.bam.pbi"
                },
                {
                    "@id": "demux_primers.lima.summary"
                },
                {
                    "@id": "flnc.bam.pbi"
                },
                {
                    "@id": "clustered.lq.bam.pbi"
                }
            ]
        },
        {
            "@id": "results_summary.json",
            "@type": "File",
            "name": "results_summary.json",
            "description": "Derived summary statistics from pipeline outputs (CPM >= 1, uniquely mapped reads)",
            "encodingFormat": "application/json",
            "contentSize": "276 B",
            "sha256": "d0a0c230102c51d61d4b17bc8d5a3e86b9441c63e1397130d82e60a351dee246"
        },
        {
            "@id": "summary_extractor.py",
            "@type": [
                "File",
                "SoftwareSourceCode"
            ],
            "name": "Summary extraction script",
            "description": "Python script that computed results_summary.json from pipeline outputs",
            "programmingLanguage": {
                "@id": "#python3"
            }
        },
        {
            "@id": "#python3",
            "@type": "ComputerLanguage",
            "name": "Python",
            "url": {
                "@id": "https://www.python.org/"
            },
            "version": "3"
        },
        {
            "@id": "#summary-extraction",
            "@type": "CreateAction",
            "name": "Expression quantification summary",
            "instrument": {
                "@id": "summary_extractor.py"
            },
            "endTime": "2026-03-04T12:44:10+00:00",
            "object": [
                {
                    "@id": "OUT.read_assignments.tsv.gz"
                },
                {
                    "@id": "OUT.gene_counts.tsv"
                },
                {
                    "@id": "OUT.transcript_counts.tsv"
                },
                {
                    "@id": "OUT.extended_annotation.gtf"
                },
                {
                    "@id": "OUT.transcript_models.gtf"
                }
            ],
            "result": [
                {
                    "@id": "results_summary.json"
                }
            ]
        }
    ]
}
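
The `sha256` values recorded for each file make it possible to confirm that a downloaded crate is intact. A minimal sketch, assuming the crate has been unpacked into a local directory with ro-crate-metadata.json at its root; the function names are hypothetical, and only entities that declare a `sha256` are checked.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large BAM files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_crate(crate_dir: Path) -> dict:
    """Map each @id that declares a sha256 to 'ok', 'mismatch', or 'missing'."""
    metadata_path = crate_dir / "ro-crate-metadata.json"
    metadata = json.loads(metadata_path.read_text())
    status = {}
    for entity in metadata["@graph"]:
        expected = entity.get("sha256")
        if expected is None:
            continue  # contextual entities and the dataset root carry no hash
        path = crate_dir / entity["@id"]
        if not path.is_file():
            status[entity["@id"]] = "missing"
        elif sha256_of(path) == expected:
            status[entity["@id"]] = "ok"
        else:
            status[entity["@id"]] = "mismatch"
    return status


if __name__ == "__main__":
    import sys

    # Usage: python verify_crate.py /path/to/unpacked/crate
    if len(sys.argv) == 2:
        for file_id, state in verify_crate(Path(sys.argv[1])).items():
            print(f"{state}\t{file_id}")
```

Files that are stored remotely (listed as HPC above) and not included in the download would be reported as missing, which is not necessarily an error.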