Libre Biotech

IsoSeq Clustering (Refine + Cluster)

Clustering: All 5 SMRT cells combined

Type: CWL
Status: succeeded
Engine: cwltool
Duration: 0.4 h
Pipeline
IsoSeq Clustering (Refine + Cluster): Run #86 (this run), succeeded, 5 sources
IsoSeq Annotation (Map + Collapse + SQANTI3)
Functional Annotation (TransDecoder + Pfam + SwissProt)

Workflow

IsoSeq Clustering (Refine + Cluster)

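Generic IsoSeq3 clustering pipeline. Merges demultiplexed BAMs, runs primer removal + polyA filtering (refine), then clusters into HQ/LQ isoform consensus sequences. Compatible with Sequel I CCS (use_qvs=false) and Sequel II/IIe HiFi data.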

Results Summary

Input CCS Reads: 715,575
FLNC Reads: 699,084
Mean FLNC Length: 0 nt
HQ Isoforms: 0
LQ Isoforms: 0
Clustering Ratio (FLNC / HQ isoforms): 0.0
Mean FL Support (reads per isoform): 9.4
Total FL Reads: 551,434
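
The summary labels Clustering Ratio as FLNC / HQ isoforms and Mean FL Support as reads per isoform. A minimal sketch of how such ratios could be computed is shown below. The function names are hypothetical, and returning 0.0 for a zero isoform count is an assumption about the reporting convention, not something the report confirms.

```python
def clustering_ratio(flnc_reads: int, hq_isoforms: int) -> float:
    """FLNC reads per HQ isoform.

    Assumption: report 0.0 instead of raising when no HQ isoforms are counted.
    """
    if hq_isoforms <= 0:
        return 0.0
    return flnc_reads / hq_isoforms


def mean_fl_support(total_fl_reads: int, isoforms: int) -> float:
    """Average number of full-length reads supporting each isoform.

    Assumption: same zero-count convention as clustering_ratio.
    """
    if isoforms <= 0:
        return 0.0
    return total_fl_reads / isoforms
```

Under that assumption, `clustering_ratio(699_084, 0)` evaluates to 0.0, which is what a ratio field would display whenever the HQ isoform count is reported as 0.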

Output Files

File                              Storage  Size
clustered.cluster                 HPC      35.7 MB
clustered.cluster_report.csv      HPC      28.3 MB
clustered.hq.bam                  HPC      50.7 MB
clustered.hq.bam.pbi              HPC      324.5 KB
clustered.lq.bam                  HPC      13.9 KB
clustered.lq.bam.pbi              HPC      146 B
demux_primers.lima.summary        HPC      918 B
flnc.bam                          HPC      1.44 GB
flnc.bam.pbi                      HPC      7.2 MB
flnc.filter_summary.report.json   HPC      819 B
job.yml                           HPC      270 B
results_summary.json              HPC      277 B
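
Each of these files is recorded in ro-crate-metadata.json with a sha256 digest, so a downloaded copy can be checked against the crate. The following is a rough sketch, assuming the files sit next to ro-crate-metadata.json in one directory; `sha256_of` and `verify_crate` are hypothetical helper names, not part of the pipeline.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large BAM files are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_crate(crate_dir: Path) -> dict[str, str]:
    """Return {file id: 'ok' | 'mismatch' | 'missing'} for entities with a sha256."""
    metadata = json.loads((crate_dir / "ro-crate-metadata.json").read_text())
    report = {}
    for entity in metadata["@graph"]:
        expected = entity.get("sha256")
        if expected is None:
            continue  # entity carries no checksum (e.g. the dataset root)
        path = crate_dir / entity["@id"]
        if not path.is_file():
            report[entity["@id"]] = "missing"
        elif sha256_of(path) == expected:
            report[entity["@id"]] = "ok"
        else:
            report[entity["@id"]] = "mismatch"
    return report
```

Calling `verify_crate(Path("path/to/crate"))` (the path is a placeholder) returns one status per checksummed file. Anything other than `ok` suggests an incomplete or altered download.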

Provenance

Execution: Expression quantification summary
Completed: 2026-03-04T13:21:19+00:00
Conforms to: RO-Crate 1.1, Workflow RO-Crate 1.0 (FAIR)
This analysis is packaged as a Research Object Crate with machine-readable provenance and FAIR metadata.

RO-Crate Metadata (JSON-LD)

{
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "about": {
                "@id": "./"
            },
            "conformsTo": [
                {
                    "@id": "https://w3id.org/ro/crate/1.1"
                },
                {
                    "@id": "https://w3id.org/workflowhub/workflow-ro-crate/1.0"
                }
            ]
        },
        {
            "@id": "./",
            "@type": "Dataset",
            "name": "IsoSeq Clustering (Refine + Cluster) \u2014 Run #86",
            "description": "Generic IsoSeq3 clustering pipeline. Merges demultiplexed BAMs, runs primer removal + polyA filtering (refine), then clusters into HQ/LQ isoform consensus sequences. Compatible with Sequel I CCS (use_qvs=false) and Sequel II/IIe HiFi data.",
            "datePublished": "2026-03-04",
            "license": {
                "@id": "https://creativecommons.org/licenses/by/4.0/"
            },
            "mainEntity": {
                "@id": "isoseq_clustering.cwl"
            },
            "hasPart": [
                {
                    "@id": "isoseq_clustering.cwl"
                },
                {
                    "@id": "job.yml"
                },
                {
                    "@id": "clustered.hq.bam"
                },
                {
                    "@id": "clustered.lq.bam"
                },
                {
                    "@id": "clustered.cluster_report.csv"
                },
                {
                    "@id": "clustered.cluster"
                },
                {
                    "@id": "flnc.filter_summary.report.json"
                },
                {
                    "@id": "flnc.bam"
                },
                {
                    "@id": "clustered.hq.bam.pbi"
                },
                {
                    "@id": "demux_primers.lima.summary"
                },
                {
                    "@id": "flnc.bam.pbi"
                },
                {
                    "@id": "clustered.lq.bam.pbi"
                },
                {
                    "@id": "results_summary.json"
                },
                {
                    "@id": "summary_extractor.py"
                }
            ],
            "mentions": [
                {
                    "@id": "#execution"
                },
                {
                    "@id": "#summary-extraction"
                }
            ]
        },
        {
            "@id": "isoseq_clustering.cwl",
            "@type": [
                "File",
                "SoftwareSourceCode",
                "ComputationalWorkflow"
            ],
            "name": "IsoSeq Clustering (Refine + Cluster)",
            "description": "Generic IsoSeq3 clustering pipeline. Merges demultiplexed BAMs, runs primer removal + polyA filtering (refine), then clusters into HQ/LQ isoform consensus sequences. Compatible with Sequel I CCS (use_qvs=false) and Sequel II/IIe HiFi data.",
            "programmingLanguage": {
                "@id": "#cwl"
            },
            "contentSize": "2.9 KB",
            "sha256": "3cd8cfcc8caaf0fb4a964a16fc02edffbddb048e0c46b8c633c8fd3abf7efa08"
        },
        {
            "@id": "#cwl",
            "@type": "ComputerLanguage",
            "name": "Common Workflow Language",
            "url": {
                "@id": "https://www.commonwl.org/"
            },
            "version": "1.2"
        },
        {
            "@id": "#cwltool",
            "@type": "SoftwareApplication",
            "name": "cwltool",
            "url": {
                "@id": "https://github.com/common-workflow-language/cwltool"
            }
        },
        {
            "@id": "job.yml",
            "@type": "File",
            "name": "job.yml",
            "description": "CWL job input parameters",
            "encodingFormat": "text/yaml",
            "contentSize": "270 B",
            "sha256": "effc1ef394d9974b677b4e09d3799df000828211ba3167994af64f8079c03670"
        },
        {
            "@id": "clustered.hq.bam",
            "@type": "File",
            "name": "clustered.hq.bam",
            "encodingFormat": "application/octet-stream",
            "contentSize": "50.7 MB",
            "sha256": "e08cdfeaa1f192004c8de849c953b738975ee8b722c2c7cfd46009243015eb9c"
        },
        {
            "@id": "clustered.lq.bam",
            "@type": "File",
            "name": "clustered.lq.bam",
            "encodingFormat": "application/octet-stream",
            "contentSize": "13.9 KB",
            "sha256": "990e90c3d107b0e8ae021bb67949cc1b7b8b3abb222a761ee1507cb52bfe456d"
        },
        {
            "@id": "clustered.cluster_report.csv",
            "@type": "File",
            "name": "clustered.cluster_report.csv",
            "encodingFormat": "text/csv",
            "contentSize": "28.3 MB",
            "sha256": "6ab9fb217040b4dcddb86e04834d66d34cc598a7785a590448748bf63e31d453"
        },
        {
            "@id": "clustered.cluster",
            "@type": "File",
            "name": "clustered.cluster",
            "encodingFormat": "application/octet-stream",
            "contentSize": "35.7 MB",
            "sha256": "8b38425f3a5d814b2317c8b800096c1809f76427771ccb7a7b0535e8ea334a22"
        },
        {
            "@id": "flnc.filter_summary.report.json",
            "@type": "File",
            "name": "flnc.filter_summary.report.json",
            "encodingFormat": "application/json",
            "contentSize": "819 B",
            "sha256": "3cd90e4eed4300b99680cab63903076932e0e28b8b16e0f5ebee682bd7ec818c"
        },
        {
            "@id": "flnc.bam",
            "@type": "File",
            "name": "flnc.bam",
            "encodingFormat": "application/octet-stream",
            "contentSize": "1.44 GB",
            "sha256": "d84011cb9304a481724045dea13a0a06615c5992ef87dedc7e9c6e5c1cf8848b"
        },
        {
            "@id": "clustered.hq.bam.pbi",
            "@type": "File",
            "name": "clustered.hq.bam.pbi",
            "encodingFormat": "application/octet-stream",
            "contentSize": "324.5 KB",
            "sha256": "636c226d979130825401a93b67ab123b3e170940343b983e7306dd8f99f5be43"
        },
        {
            "@id": "demux_primers.lima.summary",
            "@type": "File",
            "name": "demux_primers.lima.summary",
            "encodingFormat": "application/octet-stream",
            "contentSize": "918 B",
            "sha256": "ddd291911666d31f2e6fa60077aa63cf46d48ca5d0cfe5e1599d29effceb18a6"
        },
        {
            "@id": "flnc.bam.pbi",
            "@type": "File",
            "name": "flnc.bam.pbi",
            "encodingFormat": "application/octet-stream",
            "contentSize": "7.2 MB",
            "sha256": "6b724d81af58adcf240d2e9d867351cd9acae5d03a2c0dd2b67d212cfa183c9f"
        },
        {
            "@id": "clustered.lq.bam.pbi",
            "@type": "File",
            "name": "clustered.lq.bam.pbi",
            "encodingFormat": "application/octet-stream",
            "contentSize": "146 B",
            "sha256": "93dde54f7b40b778c0049366a8c6cf5f11536f2be5cb2f0d6276a6b9d8d6987e"
        },
        {
            "@id": "#execution",
            "@type": "CreateAction",
            "name": "IsoSeq Clustering (Refine + Cluster) execution",
            "instrument": {
                "@id": "isoseq_clustering.cwl"
            },
            "startTime": "2026-03-04T12:55:31+00:00",
            "endTime": "2026-03-04T13:21:04+00:00",
            "object": [
                {
                    "@id": "job.yml"
                }
            ],
            "result": [
                {
                    "@id": "clustered.hq.bam"
                },
                {
                    "@id": "clustered.lq.bam"
                },
                {
                    "@id": "clustered.cluster_report.csv"
                },
                {
                    "@id": "clustered.cluster"
                },
                {
                    "@id": "flnc.filter_summary.report.json"
                },
                {
                    "@id": "flnc.bam"
                },
                {
                    "@id": "clustered.hq.bam.pbi"
                },
                {
                    "@id": "demux_primers.lima.summary"
                },
                {
                    "@id": "flnc.bam.pbi"
                },
                {
                    "@id": "clustered.lq.bam.pbi"
                }
            ]
        },
        {
            "@id": "results_summary.json",
            "@type": "File",
            "name": "results_summary.json",
            "description": "Derived summary statistics from pipeline outputs",
            "encodingFormat": "application/json",
            "contentSize": "277 B",
            "sha256": "a8e15878e2176d7e85572c7c614a0cee8b7d136a98ff6b27fac38e897625dbee"
        },
        {
            "@id": "summary_extractor.py",
            "@type": [
                "File",
                "SoftwareSourceCode"
            ],
            "name": "Summary extraction script",
            "description": "Python script that computed results_summary.json from pipeline outputs",
            "programmingLanguage": {
                "@id": "#python3"
            }
        },
        {
            "@id": "#python3",
            "@type": "ComputerLanguage",
            "name": "Python",
            "url": {
                "@id": "https://www.python.org/"
            },
            "version": "3"
        },
        {
            "@id": "#summary-extraction",
            "@type": "CreateAction",
            "name": "Expression quantification summary",
            "instrument": {
                "@id": "summary_extractor.py"
            },
            "endTime": "2026-03-04T13:21:19+00:00",
            "object": [
                {
                    "@id": "OUT.read_assignments.tsv.gz"
                },
                {
                    "@id": "OUT.gene_counts.tsv"
                },
                {
                    "@id": "OUT.transcript_counts.tsv"
                },
                {
                    "@id": "OUT.extended_annotation.gtf"
                },
                {
                    "@id": "OUT.transcript_models.gtf"
                }
            ],
            "result": [
                {
                    "@id": "results_summary.json"
                }
            ]
        }
    ]
}
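
Because the metadata is plain JSON-LD, its internal consistency can be checked mechanically. The sketch below is an illustrative check, not an official RO-Crate validator. It flags references to `@id` values that are not defined in `@graph`, and actions whose `startTime` is later than their `endTime`. The function names and the set of properties inspected are assumptions chosen for this crate.

```python
import json
from datetime import datetime

# Properties in this crate that point at other entities by "@id" (an assumed,
# non-exhaustive selection; "license" and "url" point to external URLs instead).
REFERENCE_KEYS = ("about", "mainEntity", "hasPart", "mentions",
                  "instrument", "object", "result", "programmingLanguage")


def _referenced_ids(value):
    """Collect "@id" strings from a single reference or a list of references."""
    if value is None:
        return []
    items = value if isinstance(value, list) else [value]
    return [item["@id"] for item in items if isinstance(item, dict) and "@id" in item]


def check_crate(metadata: dict) -> list[str]:
    """Return human-readable problems found in an RO-Crate metadata document."""
    graph = metadata["@graph"]
    defined = {entity["@id"] for entity in graph}
    problems = []
    for entity in graph:
        for key in REFERENCE_KEYS:
            for ref in _referenced_ids(entity.get(key)):
                if ref not in defined:
                    problems.append(f"{entity['@id']}: {key} references undefined '{ref}'")
        start, end = entity.get("startTime"), entity.get("endTime")
        if start and end and datetime.fromisoformat(start) > datetime.fromisoformat(end):
            problems.append(f"{entity['@id']}: startTime {start} is after endTime {end}")
    return problems
```

For example, `check_crate(json.load(open("ro-crate-metadata.json")))` returns an empty list when every reference resolves and every action's timestamps are ordered.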