IsoSeq Clustering (Refine + Cluster)

Clustering: DBA I-K2 (individual)

Type

CWL

Status

succeeded

Engine

cwltool

Duration

0.2 h

Source Data

Study

Strain-specific cortex gene expression and isoform usage

Pipeline

PacBio CCS (Subreads → HiFi)

Run #75

succeeded

IsoSeq Clustering (Refine + Cluster)

Run #80 (this run)

succeeded, 1 source

IsoSeq Annotation (Map + Collapse + SQANTI3)

Run #90

succeeded, 1 source

Functional Annotation (TransDecoder + Pfam + SwissProt)

Run #100

succeeded, 1 source

Combined From

#75 — PacBio CCS (Subreads → HiFi) succeeded

Workflow

IsoSeq Clustering (Refine + Cluster)

#cwl

Software Tools

Tool	Version	URL
cwltool	-	https://github.com/common-workflow-language/cwltool

Results Summary

Input CCS Reads

103,951

FLNC Reads

101,688

Mean FLNC Length

0 nt

HQ Isoforms

LQ Isoforms

Clustering Ratio

0.0

FLNC / HQ isoforms

Mean FL Support

6.2

reads per isoform

Total FL Reads

64,817
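
The read counts above can be turned into a yield figure with simple arithmetic. The following is a minimal sketch, not the pipeline's own summary code: the function names `flnc_yield` and `clustering_ratio` are hypothetical, and the guard for a missing HQ isoform count is an assumption about how a ratio defined as FLNC / HQ isoforms could be reported when that count is unavailable, as it is on this page.

```python
from typing import Optional


def flnc_yield(ccs_reads: int, flnc_reads: int) -> float:
    """Fraction of input CCS reads retained as FLNC reads after refine."""
    if ccs_reads <= 0:
        raise ValueError("ccs_reads must be positive")
    return flnc_reads / ccs_reads


def clustering_ratio(flnc_reads: int, hq_isoforms: Optional[int]) -> Optional[float]:
    """FLNC reads per HQ isoform; None when the HQ count is missing or zero."""
    if not hq_isoforms:
        return None
    return flnc_reads / hq_isoforms


# Counts taken from the Results Summary of this run.
ccs_reads = 103_951
flnc_reads = 101_688

yield_fraction = flnc_yield(ccs_reads, flnc_reads)
print(f"FLNC yield: {yield_fraction:.2%}")  # FLNC yield: 97.82%

# The HQ isoform count is not shown on this page, so it is passed as None.
ratio = clustering_ratio(flnc_reads, None)
print("Clustering ratio:", "n/a" if ratio is None else f"{ratio:.1f}")
```

Returning `None` instead of `0.0` keeps a missing denominator distinguishable from a genuinely computed value.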

Output Files

File	Storage	Size
clustered.cluster	HPC	4.2 MB
clustered.cluster_report.csv	HPC	3.3 MB
clustered.hq.bam	HPC	9.2 MB
clustered.hq.bam.pbi	HPC	59.8 KB
clustered.lq.bam	HPC	2.7 KB
clustered.lq.bam.pbi	HPC	102 B
demux_primers.lima.summary	HPC	915 B
flnc.bam	HPC	245.7 MB
flnc.bam.pbi	HPC	1.1 MB
flnc.filter_summary.report.json	HPC	819 B
job.yml	HPC	274 B
results_summary.json	HPC	276 B

Provenance

Execution	Expression quantification summary
Completed	2026-03-04T12:43:35+00:00

RO-Crate 1.1 Workflow RO-Crate 1.0 FAIR

This analysis is packaged as a Research Object Crate with machine-readable provenance and FAIR metadata.
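
As an illustration of what "machine-readable provenance" allows, the sketch below indexes a crate's `@graph`, resolves the root dataset through the metadata descriptor's `about` link, lists its `hasPart` members, and flags any `CreateAction` whose `startTime` falls after its `endTime`. It is a sketch under stated assumptions: the helper names and the miniature `demo` crate are hypothetical and are not taken from this run.

```python
import json
from datetime import datetime


def index_graph(crate: dict) -> dict:
    """Map each entity's @id to the entity itself."""
    return {entity["@id"]: entity for entity in crate.get("@graph", [])}


def root_part_ids(crate: dict) -> list:
    """Return the @id of every hasPart member of the root dataset."""
    entities = index_graph(crate)
    descriptor = entities["ro-crate-metadata.json"]
    root = entities[descriptor["about"]["@id"]]
    return [part["@id"] for part in root.get("hasPart", [])]


def actions_with_reversed_times(crate: dict) -> list:
    """Return the @id of each CreateAction that starts after it ends."""
    flagged = []
    for entity in crate.get("@graph", []):
        if entity.get("@type") != "CreateAction":
            continue
        start, end = entity.get("startTime"), entity.get("endTime")
        if start and end and datetime.fromisoformat(start) > datetime.fromisoformat(end):
            flagged.append(entity["@id"])
    return flagged


# Hypothetical miniature crate, used only to exercise the helpers.
demo = json.loads("""
{
  "@graph": [
    {"@id": "ro-crate-metadata.json", "@type": "CreativeWork",
     "about": {"@id": "./"}},
    {"@id": "./", "@type": "Dataset",
     "hasPart": [{"@id": "workflow.cwl"}, {"@id": "job.yml"}]},
    {"@id": "#run-ok", "@type": "CreateAction",
     "startTime": "2026-01-01T10:00:00+00:00",
     "endTime": "2026-01-01T10:12:00+00:00"},
    {"@id": "#run-bad", "@type": "CreateAction",
     "startTime": "2026-01-01T22:00:00+00:00",
     "endTime": "2026-01-01T10:12:00+00:00"}
  ]
}
""")

print(root_part_ids(demo))
print(actions_with_reversed_times(demo))
```

For a real crate, the same helpers could be applied to the parsed contents of `ro-crate-metadata.json`.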

RO-Crate Metadata (JSON-LD)

{
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "about": {
                "@id": "./"
            },
            "conformsTo": [
                {
                    "@id": "https://w3id.org/ro/crate/1.1"
                },
                {
                    "@id": "https://w3id.org/workflowhub/workflow-ro-crate/1.0"
                }
            ]
        },
        {
            "@id": "./",
            "@type": "Dataset",
            "name": "IsoSeq Clustering (Refine + Cluster) \u2014 Run #80",
            "description": "Generic IsoSeq3 clustering pipeline. Merges demultiplexed BAMs, runs primer removal + polyA filtering (refine), then clusters into HQ/LQ isoform consensus sequences. Compatible with Sequel I CCS (use_qvs=false) and Sequel II/IIe HiFi data.",
            "datePublished": "2026-03-04",
            "license": {
                "@id": "https://creativecommons.org/licenses/by/4.0/"
            },
            "mainEntity": {
                "@id": "isoseq_clustering.cwl"
            },
            "hasPart": [
                {
                    "@id": "isoseq_clustering.cwl"
                },
                {
                    "@id": "job.yml"
                },
                {
                    "@id": "clustered.hq.bam"
                },
                {
                    "@id": "clustered.lq.bam"
                },
                {
                    "@id": "clustered.cluster_report.csv"
                },
                {
                    "@id": "clustered.cluster"
                },
                {
                    "@id": "flnc.filter_summary.report.json"
                },
                {
                    "@id": "flnc.bam"
                },
                {
                    "@id": "clustered.hq.bam.pbi"
                },
                {
                    "@id": "demux_primers.lima.summary"
                },
                {
                    "@id": "flnc.bam.pbi"
                },
                {
                    "@id": "clustered.lq.bam.pbi"
                },
                {
                    "@id": "results_summary.json"
                },
                {
                    "@id": "summary_extractor.py"
                }
            ],
            "mentions": [
                {
                    "@id": "#execution"
                },
                {
                    "@id": "#summary-extraction"
                }
            ]
        },
        {
            "@id": "isoseq_clustering.cwl",
            "@type": [
                "File",
                "SoftwareSourceCode",
                "ComputationalWorkflow"
            ],
            "name": "IsoSeq Clustering (Refine + Cluster)",
            "description": "Generic IsoSeq3 clustering pipeline. Merges demultiplexed BAMs, runs primer removal + polyA filtering (refine), then clusters into HQ/LQ isoform consensus sequences. Compatible with Sequel I CCS (use_qvs=false) and Sequel II/IIe HiFi data.",
            "programmingLanguage": {
                "@id": "#cwl"
            },
            "contentSize": "2.9 KB",
            "sha256": "3cd8cfcc8caaf0fb4a964a16fc02edffbddb048e0c46b8c633c8fd3abf7efa08"
        },
        {
            "@id": "#cwl",
            "@type": "ComputerLanguage",
            "name": "Common Workflow Language",
            "url": {
                "@id": "https://www.commonwl.org/"
            },
            "version": "1.2"
        },
        {
            "@id": "#cwltool",
            "@type": "SoftwareApplication",
            "name": "cwltool",
            "url": {
                "@id": "https://github.com/common-workflow-language/cwltool"
            }
        },
        {
            "@id": "job.yml",
            "@type": "File",
            "name": "job.yml",
            "description": "CWL job input parameters",
            "encodingFormat": "text/yaml",
            "contentSize": "274 B",
            "sha256": "250b963026e72188dfc0dcb3b0a2bf012fb3dcb6de00999ba5fc53baaf0983c2"
        },
        {
            "@id": "clustered.hq.bam",
            "@type": "File",
            "name": "clustered.hq.bam",
            "encodingFormat": "application/octet-stream",
            "contentSize": "9.2 MB",
            "sha256": "5485586217105b2d7f2feb6cbb40f927a9312c6e25736f535187b8b3bca08ff9"
        },
        {
            "@id": "clustered.lq.bam",
            "@type": "File",
            "name": "clustered.lq.bam",
            "encodingFormat": "application/octet-stream",
            "contentSize": "2.7 KB",
            "sha256": "d111fd5c598daf4d23ba68c1c0b1624223e3f1625abbfd891b13c43751560504"
        },
        {
            "@id": "clustered.cluster_report.csv",
            "@type": "File",
            "name": "clustered.cluster_report.csv",
            "encodingFormat": "text/csv",
            "contentSize": "3.3 MB",
            "sha256": "cae265a94651d321d96525a879cced611173bddb410327aee967edf663a8e807"
        },
        {
            "@id": "clustered.cluster",
            "@type": "File",
            "name": "clustered.cluster",
            "encodingFormat": "application/octet-stream",
            "contentSize": "4.2 MB",
            "sha256": "9c737f9dad2279283b5708abb9c6548079f3721177a6c5e0e8879d43c5189271"
        },
        {
            "@id": "flnc.filter_summary.report.json",
            "@type": "File",
            "name": "flnc.filter_summary.report.json",
            "encodingFormat": "application/json",
            "contentSize": "819 B",
            "sha256": "37fea9474179f259176e0ecf363af4c2110154716e121d8649a5d343f3364c35"
        },
        {
            "@id": "flnc.bam",
            "@type": "File",
            "name": "flnc.bam",
            "encodingFormat": "application/octet-stream",
            "contentSize": "245.7 MB",
            "sha256": "f80be1716b6e00ff1f0638b3a595d673fe8a33cee5156e0f0be165186a299bd8"
        },
        {
            "@id": "clustered.hq.bam.pbi",
            "@type": "File",
            "name": "clustered.hq.bam.pbi",
            "encodingFormat": "application/octet-stream",
            "contentSize": "59.8 KB",
            "sha256": "8619dc4366577f6a49572ff2e5a55acc185231db9fc0de0bfad7cdfe02f65ae5"
        },
        {
            "@id": "demux_primers.lima.summary",
            "@type": "File",
            "name": "demux_primers.lima.summary",
            "encodingFormat": "application/octet-stream",
            "contentSize": "915 B",
            "sha256": "84edbd41258abd161dc369f702f11abb88980fdcf34e7d0ec5a87e0e46db520b"
        },
        {
            "@id": "flnc.bam.pbi",
            "@type": "File",
            "name": "flnc.bam.pbi",
            "encodingFormat": "application/octet-stream",
            "contentSize": "1.1 MB",
            "sha256": "8e2622ced4f3c37cdc1490bc0493809cd235a34364a5fd7880947bb2e419b721"
        },
        {
            "@id": "clustered.lq.bam.pbi",
            "@type": "File",
            "name": "clustered.lq.bam.pbi",
            "encodingFormat": "application/octet-stream",
            "contentSize": "102 B",
            "sha256": "b4032f9e2c9f02f39fb73f14fc6cf34184e23decbafe541c700d3bc362bd3e41"
        },
        {
            "@id": "#execution",
            "@type": "CreateAction",
            "name": "IsoSeq Clustering (Refine + Cluster) execution",
            "instrument": {
                "@id": "isoseq_clustering.cwl"
            },
            "startTime": "2026-03-04T12:32:06+00:00",
            "endTime": "2026-03-04T12:43:20+00:00",
            "object": [
                {
                    "@id": "job.yml"
                }
            ],
            "result": [
                {
                    "@id": "clustered.hq.bam"
                },
                {
                    "@id": "clustered.lq.bam"
                },
                {
                    "@id": "clustered.cluster_report.csv"
                },
                {
                    "@id": "clustered.cluster"
                },
                {
                    "@id": "flnc.filter_summary.report.json"
                },
                {
                    "@id": "flnc.bam"
                },
                {
                    "@id": "clustered.hq.bam.pbi"
                },
                {
                    "@id": "demux_primers.lima.summary"
                },
                {
                    "@id": "flnc.bam.pbi"
                },
                {
                    "@id": "clustered.lq.bam.pbi"
                }
            ]
        },
        {
            "@id": "results_summary.json",
            "@type": "File",
            "name": "results_summary.json",
            "description": "Derived summary statistics from pipeline outputs",
            "encodingFormat": "application/json",
            "contentSize": "276 B",
            "sha256": "6c7924384ff6d3c4364820808b9dcdf3c33713ba963428da3847ee50dfbc91db"
        },
        {
            "@id": "summary_extractor.py",
            "@type": [
                "File",
                "SoftwareSourceCode"
            ],
            "name": "Summary extraction script",
            "description": "Python script that computed results_summary.json from pipeline outputs",
            "programmingLanguage": {
                "@id": "#python3"
            }
        },
        {
            "@id": "#python3",
            "@type": "ComputerLanguage",
            "name": "Python",
            "url": {
                "@id": "https://www.python.org/"
            },
            "version": "3"
        },
        {
            "@id": "#summary-extraction",
            "@type": "CreateAction",
            "name": "Expression quantification summary",
            "instrument": {
                "@id": "summary_extractor.py"
            },
            "endTime": "2026-03-04T12:43:35+00:00",
            "object": [
                {
                    "@id": "OUT.read_assignments.tsv.gz"
                },
                {
                    "@id": "OUT.gene_counts.tsv"
                },
                {
                    "@id": "OUT.transcript_counts.tsv"
                },
                {
                    "@id": "OUT.extended_annotation.gtf"
                },
                {
                    "@id": "OUT.transcript_models.gtf"
                }
            ],
            "result": [
                {
                    "@id": "results_summary.json"
                }
            ]
        }
    ]
}
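
Each `File` entity in the metadata above carries a `sha256` value, so a downloaded copy of the crate can be checked against it. The following is a minimal sketch, assuming the crate has been unpacked into a local directory whose file names match the `@id` values; `verify_crate` and `sha256_of` are hypothetical helper names, and the `sha256` key is used as it appears in this crate rather than as a standard RO-Crate term.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large BAM files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_crate(crate_dir) -> dict:
    """Return 'ok', 'mismatch', or 'missing' for every entity that records a sha256."""
    crate_dir = Path(crate_dir)
    metadata_path = crate_dir / "ro-crate-metadata.json"
    crate = json.loads(metadata_path.read_text(encoding="utf-8"))

    report = {}
    for entity in crate.get("@graph", []):
        expected = entity.get("sha256")
        if expected is None:
            continue  # entities without a recorded digest are skipped
        path = crate_dir / entity["@id"]
        if not path.is_file():
            report[entity["@id"]] = "missing"
        elif sha256_of(path) == expected.lower():
            report[entity["@id"]] = "ok"
        else:
            report[entity["@id"]] = "mismatch"
    return report
```

Calling `verify_crate("path/to/unpacked/crate")` returns a dictionary keyed by file name. Entities such as `summary_extractor.py` that record no digest do not appear in the report.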