Welcome to CellAnn version 1.0!

CellAnn is a web application for predicting the cell types of single-cell clusters based on published reference datasets. It provides a comprehensive scRNA-seq reference database in which users can easily find the reference datasets relevant to their analysis.

What's new:

2023-7-5 update Updated datasets.
2023-3-5 update Added installation instructions for running CellAnn on a local computer.
2023-3-1 update Added more datasets for Mouse blood.
2023-2-26 update Added bars to adjust the ranges of gene expression in Step4.
2023-2-23 bugfix Now users can stay on the website without losing their connection for up to 1 hour.
2023-1-17 update CellAnn version 1.0 is online!

Database:

Human:

Mouse:

Workflow:

Citation:

Pin Lyu, Yijie Zhai et al. CellAnn: A comprehensive, super-fast, and user-friendly single-cell annotation web server[J]. bioRxiv.

Analysis

Cell type alignment runs in 4 major steps:

Step1: Upload your average expression files

Step2: Select datasets from the CellAnn database

Step3: Map your clusters to your selected datasets

Step4*: Visualize your results and generate your custom cell type annotation

*This step is optional

Step1.1 Upload average gene expression file:

Click Browse... to upload:

Don't know the format?

Download an example of Human retina:

Step1 input

Preview your input:

Step1.1 Status:
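
If you are unsure whether a file matches the expected Step1 format, a quick local check before uploading can help. The sketch below assumes the layout produced by the preparation scripts in the FAQ: a tab-separated table whose first column is named GENE, followed by one numeric column per cluster. The function name `check_step1_input` and the uniqueness check on gene names are illustrative assumptions, not part of CellAnn.

```python
import io
import pandas as pd

def check_step1_input(path_or_buffer):
    """Return a list of problems found in a Step1 input table (empty if none)."""
    df = pd.read_csv(path_or_buffer, sep="\t")
    problems = []
    if df.shape[1] < 2:
        problems.append("expected a GENE column plus at least one cluster column")
        return problems
    if df.columns[0] != "GENE":
        problems.append("first column should be named GENE")
    # Assumption: each gene appears only once in the table
    if df.iloc[:, 0].duplicated().any():
        problems.append("gene names should be unique")
    non_numeric = [c for c in df.columns[1:]
                   if not pd.api.types.is_numeric_dtype(df[c])]
    if non_numeric:
        problems.append("non-numeric cluster columns: " + ", ".join(map(str, non_numeric)))
    return problems

# Toy table standing in for a real Step1 file
example = io.StringIO("GENE\t0\t1\nRHO\t5.213\t0.0\nOPN1SW\t0.0\t4.871\n")
print(check_step1_input(example))  # prints [] when no problems are found
```

To check a real file, pass its path instead of the toy buffer.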

Step2.1 Select dataset(s) from the CellAnn database:

Choose species:

Choose organ type:

Step2.2 Review your selected dataset(s):

Step2 Status:

Step3.1 Run the analysis:

Click the button to run:

Advanced options*

Use background references (Use / Do not use):

Set the background correlation cutoff ranges:

Set the references correlation cutoff ranges:

Running progress:

Running parameters:

Reference states:

Step3.2 Download the results:

Preview alignment results:

Download Results

Step3 Status:

Step4.1 Upload your UMAP/TSNE file:

Click Browse... to upload (UMAP/TSNE):

Don't know the format?

Download an example of Human retina (TSNE):

Step4 input

Preview:

Step4.1 Status:
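
The Step4 file can be checked locally in the same spirit. The sketch below assumes the column layout written by the Python script in the FAQ (cell, cluster, dim1, dim2, tab-separated); files produced another way may use different column names. The function name `check_step4_input` is hypothetical.

```python
import io
import pandas as pd

# Assumed from the Python preparation script in the FAQ
EXPECTED_COLUMNS = ["cell", "cluster", "dim1", "dim2"]

def check_step4_input(path_or_buffer):
    """Return a list of problems found in a Step4 input table (empty if none)."""
    df = pd.read_csv(path_or_buffer, sep="\t")
    problems = []
    if list(df.columns) != EXPECTED_COLUMNS:
        problems.append("expected columns: " + ", ".join(EXPECTED_COLUMNS))
        return problems
    if df["cell"].duplicated().any():
        problems.append("cell names should be unique")
    for col in ("dim1", "dim2"):
        if not pd.api.types.is_numeric_dtype(df[col]):
            problems.append(col + " should be numeric")
    return problems

# Toy table standing in for a real Step4 file
example = io.StringIO("cell\tcluster\tdim1\tdim2\nc1\t0\t1.5\t-2.0\nc2\t1\t0.3\t4.1\n")
print(check_step4_input(example))  # prints [] when no problems are found
```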

Step4.2 Visualize:

Plotting alignment results and gene expression by study:

Choose a reference:

Choose a cluster in query dataset:

Set data range:

Step4.3 Add your custom annotation:

Double-click a cell in the 'Your custom label' column to edit it.

Remember to click once outside the cell to save your changes before downloading.

Download results

Install CellAnn Locally:

CellAnn is open source and you can run it on your own computer.

Installation instructions are available in our GitHub repository.

Input Examples:

We provide example inputs generated from different studies:

1. Download example inputs from the Mouse Tabula Muris study here.

2. Download example inputs from the Human Pancreas studies here.

3. Download example inputs from the Human PBMC study here.

Frequently Asked Questions:

Q1: How do I prepare the input files before using CellAnn?

For R Seurat users:

1. Install devtools (for installing GitHub packages) if it isn't already installed.

### install devtools in R ###
if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")

2. Prepare your Seurat object with the “prepare_CellAnn” function.

### download scripts ###

devtools::source_url("https://raw.githubusercontent.com/Pinlyu3/CellAnn/main/prepare_CellAnn.R")

### prepare_CellAnn function:
### parameter: seurat_obj: your Seurat object
### parameter: folder (character): path where the CellAnn output files are written; default is your current folder
### parameter: sample_name (character): name of this sample, such as "Liver_1", "eye2_2", etc.
### parameter: matrix_name (character): default uses the "RNA" counts matrix in your Seurat object
### parameter: dims (character): "umap" or "tsne"; default is 'umap'
### parameter: cluster (character): name of the metadata column in your Seurat object that stores the cluster information; default is 'seurat_clusters'

folder <- getwd()  # output folder; replace with any existing path
prepare_CellAnn(seurat_obj,folder=folder,sample_name='your_samples_name',matrix_name='RNA',dims='umap',cluster='seurat_clusters')

### After running prepare_CellAnn, you will find 2 prepared files in your folder

list.files(folder)

For Python Scanpy users:

### load libraries and your query dataset:
import numpy as np
import pandas as pd
import scanpy as sc
adata = sc.read('data.h5ad')

### Calculate the cluster-level expression: counts are summed within each cluster,
### scaled to 1e5 total counts per cluster and log-transformed
### we assume that the raw counts are stored in adata.raw.X
### the cluster information is stored in adata.obs['cluster']
### the gene names are taken from adata.raw.var_names so that they match adata.raw.X
### the cell names are stored in adata.obs_names
### the 2-dimensional UMAP is stored in adata.obsm['X_umap']
### parameter: sample_name (character): name of this sample, such as "Liver_1"

def cluster_average(adata,sample_name):
  # cluster label of every cell, plus gene and cell names
  row_annotations = np.array(adata.obs['cluster'])
  unique_annotations = np.unique(row_annotations)
  gene_names = adata.raw.var_names
  cell_names = adata.obs_names
  # raw counts as a dense cells x genes array (sparse or dense input)
  raw_x = adata.raw.X
  mat = raw_x.toarray() if hasattr(raw_x, 'toarray') else np.asarray(raw_x)
  # one row per cluster, one column per gene
  merged_matrix = np.zeros((len(unique_annotations), mat.shape[1]))
  for i, annotation in enumerate(unique_annotations):
    # sum the counts over all cells of this cluster
    sum_rows = mat[row_annotations == annotation, :].sum(axis=0)
    # scale to 1e5 total counts, then take log(x + 1)
    sum_rows_norm = sum_rows / np.sum(sum_rows) * 1e5
    merged_matrix[i, :] = np.log(sum_rows_norm + 1)
  #### Step1 input: genes in rows, clusters in columns ####
  df_log = pd.DataFrame(merged_matrix.T)
  df_log.columns = unique_annotations
  df_log.index = gene_names
  df_log.insert(0,"GENE",df_log.index)
  df_log = df_log.round(3)
  Output_Step1_name = sample_name + '_CellAnn_Step1_input.txt'
  df_log.to_csv(Output_Step1_name, index=False,sep='\t')
  #### Step4 input: one row per cell with its cluster and UMAP coordinates ####
  df_umap = pd.DataFrame(adata.obsm['X_umap'])
  df_umap.columns = ["dim1","dim2"]
  df_umap.insert(0,"cell",cell_names)
  df_umap.insert(1,"cluster",row_annotations)
  Output_Step4_name = sample_name + '_CellAnn_Step4_input.txt'
  df_umap.to_csv(Output_Step4_name, index=False,sep='\t')

### run cluster_average with your adata to get the input of step1 and step4:
cluster_average(adata,sample_name="your_sample_name")
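
To see what the per-cluster normalization does, here is a small worked example on toy data that needs only NumPy. It mirrors the arithmetic in cluster_average (sum counts per cluster, scale to 1e5, take log(x + 1)); the helper name `pooled_log_expression` and the toy counts are made up for illustration.

```python
import numpy as np

# Toy raw counts: 4 cells x 3 genes, with two clusters "a" and "b"
counts = np.array([[1, 0, 3],
                   [1, 2, 1],
                   [0, 5, 5],
                   [0, 5, 5]])
clusters = np.array(["a", "a", "b", "b"])

def pooled_log_expression(counts, clusters, scale=1e5):
    """Sum counts per cluster, scale each cluster to `scale`, return log(x + 1)."""
    labels = np.unique(clusters)
    out = np.zeros((len(labels), counts.shape[1]))
    for i, label in enumerate(labels):
        summed = counts[clusters == label].sum(axis=0)
        out[i] = np.log(summed / summed.sum() * scale + 1)
    return labels, out

labels, expr = pooled_log_expression(counts, clusters)
print(labels)
print(expr.round(3))
```

Cluster "a" pools to counts of 2, 2 and 4, so the third gene receives half of the scaled total. A gene with no counts in a cluster comes out as exactly 0. A cluster whose total count is zero would cause a division by zero, so such clusters should be removed beforehand.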

Q2: How do I use CellAnn?

Q3: How do I add the CellAnn results to a Seurat object?

1. You can integrate CellAnn outputs into a Seurat object in R with the “merge_CellAnn_Seurat” function.
2. Make sure you have installed the “readr” library.

### download scripts ###

devtools::source_url("https://raw.githubusercontent.com/Pinlyu3/CellAnn/main/merge_CellAnn_Seurat.R")

### merge_CellAnn_Seurat function:
### parameter: seurat_obj: your Seurat object
### parameter: cluster (character): name of the metadata column in your Seurat object that stores the cluster information; default is 'seurat_clusters'
### parameter: cell_anno: the annotation file from CellAnn Step3 or Step4

Your_new_seurat <- merge_CellAnn_Seurat(seurat_obj=Your_old_seurat,cluster='seurat_clusters',cell_anno='step3orstep4_output.txt')

### After running merge_CellAnn_Seurat, cell_anno will have been merged into the metadata of your Seurat object

head(Your_new_seurat@meta.data)
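
For Scanpy users, a similar merge can be sketched with pandas alone. This is not a CellAnn function: the helper `add_annotation`, the result column names ("cluster", "annotation") and the new column "CellAnn_label" are assumptions for illustration, so check the header of your downloaded Step3 or Step4 file and adjust the names accordingly.

```python
import pandas as pd

# Stand-in for adata.obs: one row per cell, with a cluster column
obs = pd.DataFrame({"cluster": ["0", "1", "0", "2"]},
                   index=["cell1", "cell2", "cell3", "cell4"])

# Stand-in for the CellAnn result table (column names are hypothetical)
cell_anno = pd.DataFrame({"cluster": ["0", "1"],
                          "annotation": ["Rod", "Cone"]})

def add_annotation(obs, cell_anno, cluster_col="cluster", label_col="annotation"):
    """Return a copy of obs with a per-cell label looked up by cluster."""
    lookup = dict(zip(cell_anno[cluster_col].astype(str), cell_anno[label_col]))
    out = obs.copy()
    # Clusters missing from the annotation table are marked as unassigned
    out["CellAnn_label"] = out[cluster_col].astype(str).map(lookup).fillna("unassigned")
    return out

annotated = add_annotation(obs, cell_anno)
print(annotated)
```

With a real AnnData object, the same helper could be applied to adata.obs after reading the result file with pandas.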

Q4: I can't find the datasets I'm interested in in CellAnn.

Please contact us and provide a link (e.g. a GEO accession number) to the datasets and the related papers.

Issues using CellAnn?

CellAnn is currently in beta. If you find a bug, please report an issue on GitHub with the Bug Report form.

Contact us:

Please e-mail us if you have any other questions:
