Issue with load_data dataloader: KeyError 'IR_VJ_1_v_call' #1

@JiahaoChow

Hello, thank you for sharing your work on MIST. I am very interested in your project, and I am currently trying to reproduce your algorithm. However, I encountered some issues related to the dataloader.

I created the environment with the following command:

conda create -n mist python=3.10 -y && conda activate mist && pip install git+https://github.com/aapupu/MIST.git

Then I tried running the following code:

from mist.data import load_data

tcr_paths = [
    './data/raw/CD8+Tcells/donor1/vdj_v1_hs_aggregated_donor1_all_contig_annotations.csv', 
    './data/raw/CD8+Tcells/donor2/vdj_v1_hs_aggregated_donor2_all_contig_annotations.csv', 
    './data/raw/CD8+Tcells/donor3/vdj_v1_hs_aggregated_donor3_all_contig_annotations.csv', 
    './data/raw/CD8+Tcells/donor4/vdj_v1_hs_aggregated_donor4_all_contig_annotations.csv'
]

rna_paths = [
    './data/raw/CD8+Tcells/donor1/vdj_v1_hs_aggregated_donor1_filtered_feature_bc_matrix.h5',
    './data/raw/CD8+Tcells/donor2/vdj_v1_hs_aggregated_donor2_filtered_feature_bc_matrix.h5',
    './data/raw/CD8+Tcells/donor3/vdj_v1_hs_aggregated_donor3_filtered_feature_bc_matrix.h5',
    './data/raw/CD8+Tcells/donor4/vdj_v1_hs_aggregated_donor4_filtered_feature_bc_matrix.h5'
]

batches = ['donor1', 'donor2', 'donor3', 'donor4']

adata, *dataloader_tuple = load_data(
    rna_path=rna_paths, 
    tcr_path=tcr_paths, 
    batch=batches,
    rna_data_type='h5',
    tcr_data_type='10X',
)

However, I received the following error:

KeyError: 'IR_VJ_1_v_call'

The traceback points to this line in mist/data.py:

adata_tcr.obs['IR_VJ_1_v_call'] = adata_tcr.obs['IR_VJ_1_v_call'].astype('str')

It seems that the resulting adata_tcr object has 0 features and no obs entries.
Do you think this could be due to a version mismatch (e.g., scirpy or related dependencies), or is there something else I might be missing?
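
A minimal sketch of two checks that might help narrow this down, using only the Python standard library. The first prints the installed versions of the packages most likely involved. As far as I know, scirpy 0.13 moved receptor data out of flat obs columns such as IR_VJ_1_v_call and into an awkward-array structure, so a newer scirpy could explain the missing column; that is an assumption to verify against the scirpy release notes, not a confirmed diagnosis. The second counts contig rows per chain in a Cell Ranger contig annotation CSV, to confirm the input actually contains TRA/TRB records. The helper names `installed_version` and `chain_counts` are hypothetical and are not part of MIST or scirpy.

```python
import csv
from collections import Counter
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def chain_counts(csv_file):
    """Count contig rows per chain in a contig annotation CSV.

    `csv_file` is any open text file object. The `chain` column name is
    assumed to follow the Cell Ranger all_contig_annotations.csv layout.
    """
    reader = csv.DictReader(csv_file)
    if reader.fieldnames is None or "chain" not in reader.fieldnames:
        raise ValueError("no 'chain' column found; is this a contig annotation CSV?")
    return Counter(row["chain"] for row in reader)


# Report the versions of the packages that read and hold the TCR data.
# The package list is a guess at the relevant dependencies.
for pkg in ("scirpy", "anndata", "scanpy", "awkward"):
    print(pkg, installed_version(pkg))
```

To check one of the input files, open it and pass the file object, for example `with open(tcr_paths[0], newline="") as fh: print(chain_counts(fh))`. If the counts show TRA and TRB rows while adata_tcr is still empty, the files are probably fine and the version mismatch becomes the more likely explanation.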

Any guidance would be greatly appreciated.

Thank you!
