Release notes#
Version 1.12#
1.12.0.dev117+g2f4c8a4e 2025-06-12#
Breaking changes#
Adopt the Scientific Python deprecation schedule: remove Python 3.10 support and add Python 3.13 support, require anndata≥0.9 P Angerer (#3485)
Bug fixes#
Ensure axis_nnz calculates its chunk size/shape correctly with dask I Gold (#3667)
Development Process#
Replaced several internal utilities with their fast_array_utils counterparts P Angerer (#3598)
Documentation#
Fix documentation location for scanpy.settings P Angerer (#3672)
Features#
Add zarr support and convert_strings_to_categoricals parameter to scanpy.write() P Angerer (#3498)
Add support for scipy.sparse.csr_array and scipy.sparse.csc_array P Angerer (#3563)
Add a compressed parameter to the read_10x_mtx function so that it can read uncompressed matrix files, such as those produced by STARsolo and other tools that do not gzip their output by default (#3564)
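As a rough illustration of what the compressed parameter changes, the sketch below lists the files a 10x-style matrix directory would be expected to contain in each case. The helper expected_10x_files is hypothetical and not part of scanpy, and the file names assume the Cell Ranger v3 layout (matrix.mtx, features.tsv, barcodes.tsv).

```python
from pathlib import Path


def expected_10x_files(directory, *, compressed=True, prefix=""):
    """Hypothetical helper: list the files a 10x-style reader would look for.

    Assumes Cell Ranger v3 file names; gzipped output simply carries an
    extra ".gz" suffix on each of the three files.
    """
    suffix = ".gz" if compressed else ""
    names = ["matrix.mtx", "features.tsv", "barcodes.tsv"]
    return [Path(directory) / f"{prefix}{name}{suffix}" for name in names]


# Gzipped layout (the usual Cell Ranger output).
gzipped = expected_10x_files("filtered", compressed=True)
# Plain-text layout, as written by tools that do not gzip by default.
plain = expected_10x_files("filtered", compressed=False)
```

With scanpy installed, the corresponding call for uncompressed output would pass compressed=False to read_10x_mtx, as described in the entry above.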
Miscellaneous improvements#
Deprecate scanpy.tl.louvain() P Angerer (#3658)
Version 1.11#
1.11.2 2025-05-28#
Bug fixes#
Fix zappy compatibility for clip_array P Angerer (#3351)
Fixes an error where regress_out would fail to work with integer types S Dicks (#3461)
Prevent plotting with mask_obs from mutating data V Menon (#3496)
Prevent scanpy.pp.scale() from creating a dask Array with numpy.matrix chunks P Angerer (#3597)
Allow using sklearn ≥1.6, Dask ≥2024.8, and sphinx ≥8.2.1 P Angerer (#3611)
Fixed handling of ext argument in scanpy.read() I Gold (#3643)
Fix error message when trying to use sc.pp.pca(x, zero_center=False) with a sparse dask array. P Angerer (#3646)
Documentation#
Clarify use of implementations in scanpy.pp.pca() docs. P Angerer (#3655)
Performance#
Speed up for a categorical regressor in regress_out() S Dicks I Gold (#3353)
In pp.normalize_total, the median is now computed in-memory when using Dask S Dicks (#3379)
Speed up pp.normalize_total with a numba kernel for csr-matrices S Dicks (#3571)
1.11.1 2025-03-31#
Bug fixes#
Features#
Allow covariance_eigh as a solver option for pca() with dask.array.Array dense data ilan-gold (#3528)
Performance#
Speed up wilcoxon rank-sum test with numba G Wu (#3529)
1.11.0 2025-02-14#
Release candidates:
rc2 2025-01-24
rc1 2024-12-20
Features#
rc1 sample() supports both upsampling and downsampling of observations and variables. subsample() is now deprecated. G Eraslan & P Angerer (#943)
rc1 Add layer argument to scanpy.tl.score_genes() and scanpy.tl.score_genes_cell_cycle() L Zappia (#2921)
rc1 Prevent raw conflict with layer in score_genes() S Dicks (#3155)
rc1 Add support for median as an aggregation function to aggregate(). This allows for median-based aggregation of data (e.g., pseudobulk), complementing existing methods like mean- and sum-based aggregation M Dehkordi (Farhad) (#3180)
rc1 Add key_added argument to pca(), tsne() and umap() P Angerer (#3184)
rc1 Support running scanpy.pp.pca() on sparse Dask arrays with the 'covariance_eigh' solver P Angerer (#3263)
rc1 Use upstreamed PCA implementation for csr_array and csr_matrix (see scikit-learn Version 1.4.0) P Angerer (#3267)
rc1 Add explicit support to scanpy.pp.pca() for svd_solver='covariance_eigh' P Angerer (#3296)
rc1 Add support for dask.array.Array to scanpy.pp.calculate_qc_metrics() I Gold (#3307)
rc1 Support layer parameter in scanpy.pl.highest_expr_genes() P Angerer (#3324)
rc1 Run numba functions single-threaded when called from inside of a ThreadPool P Angerer (#3335)
rc1 Switch print_header() and print_versions() to session_info2 P Angerer (#3384)
rc1 Add sampling probabilities/mask parameter p to sample() P Angerer (#3410)
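To make the median-based aggregation mentioned above concrete, here is a minimal pure-Python sketch of a per-group, per-gene median (a pseudobulk). It is not scanpy's aggregate() implementation; the function name aggregate_median and the nested-list input format are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median


def aggregate_median(rows, groups):
    """Per-group, per-gene median of a dense cells-by-genes table.

    `rows` holds one list of expression values per cell and `groups`
    holds one group label per cell (both names are illustrative only).
    """
    by_group = defaultdict(list)
    for row, group in zip(rows, groups):
        by_group[group].append(row)
    # zip(*cells) transposes a group's cells so each tuple is one gene.
    return {
        group: [median(gene_values) for gene_values in zip(*cells)]
        for group, cells in by_group.items()
    }


counts = [[1, 0, 5], [3, 0, 7], [100, 2, 6], [4, 4, 4]]
labels = ["T", "T", "T", "B"]
pseudobulk = aggregate_median(counts, labels)
```

In this toy input the first gene of group "T" has one outlying cell (100); the median stays at 3, whereas a mean-based aggregation would be pulled up to about 34.7.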
Performance#
rc1 Speed up regress_out() P Ashish, P Angerer & S Dicks (#3284)
Documentation#
rc1 Improve harmony_integrate() docs D Kühl (#3362)
rc1 Raise FutureWarning when calling deprecated scanpy.pp functions P Angerer (#3380)
rc1 P Angerer (#3407)
Bug fixes#
rc1 Upper-bound sklearn<1.6.0 due to dask/dask-ml#1002 Ilan Gold (#3393)
rc2 Fix rank_genes_groups() compatibility with data >10M cells P Angerer (#3426)
rc2 Fix scanpy.pl.rank_genes_groups()’s ax parameter P Angerer (#3428)
Development Process#
rc2 Fix version number inference in development environments (CI and local) P Angerer (#3441)
Version 1.10#
1.10.4 2024-11-12#
Breaking changes#
Remove Python 3.9 support P Angerer (#3283)
Bug fixes#
Fix scanpy.pl.DotPlot.style(), scanpy.pl.MatrixPlot.style(), and scanpy.pl.StackedViolin.style() resetting all non-specified parameters P Angerer (#3206)
Accept 'group' instead of 'obs' for standard_scale parameter in stacked_violin() P Angerer (#3243)
Use density_norm instead of scale (cont. from #2844) in violin() and stacked_violin() P Angerer (#3244)
Switched all compatibility adapters for positional parameters to FutureWarning P Angerer (#3264)
Catch PerfectSeparationWarning during regress_out() J Wagner (#3275)
Fix scanpy.pp.highly_variable_genes() for batches of size 1 P Angerer (#3286)
Fix scanpy.pl.scatter()’s color parameter to take collections as advertised P Angerer (#3299)
Fix scanpy.pl.highest_expr_genes() when used with a categorical gene symbol column P Angerer (#3302)
1.10.3 2024-09-17#
Bug fixes#
Prevent empty control gene set in score_genes() M Müller (#2875)
Fix subset=True of highly_variable_genes() when flavor is seurat or cell_ranger, and batch_key!=None E Roellin (#3042)
Add compatibility with numpy 2.0 P Angerer (#3065 and #3115)
Fix legend_loc argument in scanpy.pl.embedding() not accepting matplotlib parameters P Angerer (#3163)
Fix dispersion cutoff in highly_variable_genes() in presence of NaNs P Angerer (#3176)
Fix axis labeling for swapped axes in rank_genes_groups_stacked_violin() Ilan Gold (#3196)
Upper bound dask on account of scverse/anndata#1579 Ilan Gold (#3217)
The fa2-modified package replaces forceatlas2 for the latter’s lack of maintenance A Alam (#3220)
1.10.2 2024-06-25#
Development Process#
Add performance benchmarking #2977 R Shrestha, P Angerer
Documentation#
Bug fixes#
Compatibility with matplotlib 3.9 #2999 I Virshup
Add clear errors where backed mode-like matrices (i.e., from sparse_dataset) are not supported #3048 I Gold
Write out full pca results when _choose_representation is called i.e., neighbors() without pca() #3078 I Gold
Fix deprecated use of .A with sparse matrices #3084 P Angerer
Fix zappy support #3089 P Angerer
Performance#
1.10.1 2024-04-09#
Documentation#
Added how-to example on plotting with Marsilea #2974 Y Zheng
Bug fixes#
Fix aggregate when aggregating by more than two groups #2965 I Virshup
Performance#
1.10.0 2024-03-26#
scanpy 1.10 brings a large number of new features, performance improvements, and improved documentation.
Some highlights:
Improved support for out-of-core workflows via dask. See new tutorial: Using dask with Scanpy demonstrating counts-to-clusters for 1.4 million cells in <10 min.
A new basic clustering tutorial demonstrating an updated workflow.
Opt-in increased performance for neighbor search and clustering (how to guide).
Ability to mask observations or variables from a number of methods (see Customizing Scanpy plots for an example with plotting embeddings)
A new function aggregate() for computing aggregations of your data, very useful for pseudo bulking!
Features#
scrublet() and scrublet_simulate_doublets() were moved from scanpy.external.pp to scanpy.pp. The scrublet implementation is now maintained as part of scanpy #2703 P Angerer
scanpy.pp.pca(), scanpy.pp.scale(), scanpy.pl.embedding(), and scanpy.experimental.pp.normalize_pearson_residuals_pca() now support a mask parameter #2272 C Bright, T Marcella, & P Angerer
Enhanced dask support for some internal utilities, paving the way for more extensive dask support #2696 P Angerer
scanpy.pp.highly_variable_genes() supports dask for the default seurat and cell_ranger flavors #2809 P Angerer
New function scanpy.get.aggregate() which allows grouped aggregations over your data. Useful for pseudobulking! #2590 Isaac Virshup Ilan Gold Jon Bloom
scanpy.pp.neighbors() now has a transformer argument allowing the use of different ANN/KNN libraries #2536 P Angerer
scanpy.experimental.pp.highly_variable_genes() using flavor='pearson_residuals' now uses numba for variance computation and is faster #2612 S Dicks & P Angerer
scanpy.tl.leiden() now offers igraph’s implementation of the leiden algorithm via flavor when set to igraph. leidenalg’s implementation is still default, but discouraged. #2815 I Gold
scanpy.pp.highly_variable_genes() has new flavor seurat_v3_paper that is in its implementation consistent with the paper description in Stuart et al 2018. #2792 E Roellin
scanpy.datasets.blobs() now accepts a random_state argument #2683 E Roellin
scanpy.pp.pca() and scanpy.pp.regress_out() now accept a layer argument #2588 S Dicks
scanpy.pp.subsample() with copy=True can now be called in backed mode #2624 E Roellin
scanpy.external.pp.harmony_integrate() now runs with 64 bit floats improving reproducibility #2655 S Dicks
scanpy.tl.rank_genes_groups() no longer warns that its default was changed from t-test_overestim_var to t-test #2798 L Heumos
scanpy.pp.calculate_qc_metrics now allows qc_vars to be passed as a string #2859 N Teyssier
scanpy.tl.leiden() and scanpy.tl.louvain() now store clustering parameters in the key provided by the key_added parameter instead of always writing to (or overwriting) a default key #2864 J Fan
scanpy.pp.scale() now clips np.ndarray also at -max_value for zero-centering #2913 S Dicks
Support sparse chunks in dask scale(), normalize_total() and highly_variable_genes() (seurat and cell-ranger tested) #2856 ilan-gold
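The symmetric clipping described for scale() can be pictured with a small pure-Python sketch. This is not scanpy's code: the function name scale_and_clip is made up, and using the population standard deviation is an arbitrary assumption here.

```python
from statistics import fmean, pstdev


def scale_and_clip(values, max_value):
    """Zero-center one gene, scale to unit variance, clip on both sides.

    Assumption: population standard deviation; a constant gene falls
    back to a divisor of 1.0 to avoid division by zero.
    """
    mean = fmean(values)
    std = pstdev(values) or 1.0
    scaled = [(v - mean) / std for v in values]
    # After zero-centering, values can be strongly negative as well,
    # so the clip is applied at -max_value and at +max_value.
    return [min(max(s, -max_value), max_value) for s in scaled]


clipped = scale_and_clip([10, 10, 10, 10, 0], max_value=1.0)
```

Here the last value scales to -2.0 and is clipped to -1.0, while the other values (0.5 each) are left untouched.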
Documentation#
Doc style overhaul #2220 A Gayoso
Re-add search-as-you-type, this time via readthedocs-sphinx-search #2805 P Angerer
Fixed a lot of broken usage examples #2605 P Angerer
Improved harmonization of return field of sc.pp and sc.tl functions #2742 E Roellin
Improved docs for percent_top argument of calculate_qc_metrics() #2849 I Virshup
New basic clustering tutorial (Preprocessing and clustering), based on one from scverse-tutorials #2901 I Virshup
Overhauled Tutorials page, and added new How to section to docs #2901 I Virshup
Added a new tutorial on working with dask (Using dask with Scanpy) #2901 I Gold I Virshup
Bug fixes#
Updated read_visium() such that it can read spaceranger 2.0 files L Lehner
Fix normalize_total() for dask #2466 P Angerer
Fix setting scanpy.settings.verbosity in some cases #2605 P Angerer
Fix all remaining pandas warnings #2789 P Angerer
Fix some annoying plotting warnings around violin plots #2844 P Angerer
Scanpy now has a test job which tests against the minimum versions of the dependencies. In the process of implementing this, many bugs associated with using older versions of pandas, anndata, numpy, and matplotlib were fixed. #2816 I Virshup
Fix warnings caused by internal usage of pandas.DataFrame.stack with pandas>=2.1 #2864 I Virshup
scanpy.get.aggregate() now always returns numpy.ndarray #2893 S Dicks
Removes self from array of neighbors for use_approx_neighbors = True in scrublet() #2896 S Dicks
Compatibility with scipy 1.13 #2943 I Virshup
Fix use of dendrogram() on highly correlated low precision data #2928 P Angerer
Fix pytest deprecation warning #2879 P Angerer
Development Process#
Deprecations#
Dropped support for Python 3.8. More details here. #2695 P Angerer
Deprecated specifying large numbers of function parameters by position as opposed to by name/keyword in all public APIs. e.g. prefer sc.tl.umap(adata, min_dist=0.1, spread=0.8) over sc.tl.umap(adata, 0.1, 0.8) #2702 P Angerer
Dropped support for umap<0.5 for performance reasons. #2870 P Angerer
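One way to picture the positional-parameter deprecation is a small compatibility adapter that still accepts positional arguments but warns about them. The decorator below is a hypothetical sketch, not scanpy's actual adapter; the name positional_deprecated, the stand-in umap function, and the choice of FutureWarning as the warning category are assumptions for illustration.

```python
import functools
import warnings


def positional_deprecated(n_positional):
    """Hypothetical decorator: warn when more than `n_positional`
    arguments are passed by position instead of by keyword."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if len(args) > n_positional:
                warnings.warn(
                    f"{func.__name__}: pass arguments after the first "
                    f"{n_positional} by keyword",
                    FutureWarning,
                    stacklevel=2,
                )
            return func(*args, **kwargs)

        return wrapper

    return decorator


@positional_deprecated(1)
def umap(adata, min_dist=0.5, spread=1.0):
    # Stand-in for a real embedding function; it only echoes its parameters.
    return {"min_dist": min_dist, "spread": spread}


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    by_keyword = umap("adata", min_dist=0.1, spread=0.8)  # preferred, silent
    by_position = umap("adata", 0.1, 0.8)  # still works, but warns
```

Both calls return the same result; only the positional form records a warning, which gives callers time to migrate before positional use is removed.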
Version 1.9#
1.9.8 2024-01-26#
Bug fixes#
Fix handling of numpy array palettes for old numpy versions #2832 P Angerer
1.9.7 2024-01-25#
Bug fixes#
Fix handling of numpy array palettes (e.g. after write-read cycle) #2734 P Angerer
Specify correct version of matplotlib dependency #2733 P Fisher
Fix scanpy.pl.violin() usage of seaborn.catplot #2739 E Roellin
Fix scanpy.pp.highly_variable_genes() to handle the combinations of inplace and subset consistently #2757 E Roellin
Replace usage of various deprecated functionality from anndata and pandas #2678 #2779 P Angerer
Allow to use default n_top_genes when using scanpy.pp.highly_variable_genes() flavor 'seurat_v3' #2782 P Angerer
Fix scanpy.read_10x_mtx()’s gex_only=True mode #2801 P Angerer
1.9.6 2023-10-31#
Bug fixes#
Allow scanpy.pl.scatter() to accept a str palette name #2571 P Angerer
Make scanpy.external.tl.palantir() compatible with palantir >=1.3 #2672 DJ Otto
Fix scanpy.pl.pca() when return_fig=True and annotate_var_explained=True #2682 J Wagner
Temp fix for #2680 by skipping seaborn version 0.13.0 #2661 P Angerer
Fix scanpy.pp.highly_variable_genes() to not modify the used layer when flavor=seurat #2698 E Roellin
Prevent pandas from causing infinite recursion when setting a slice of a categorical column #2719 P Angerer
1.9.5 2023-09-08#
Bug fixes#
Remove use of deprecated dtype argument to AnnData constructor #2658 Isaac Virshup
1.9.4 2023-08-24#
Bug fixes#
Support scikit-learn 1.3 #2515 P Angerer
Deal with None value vanishing from things like .uns['log1p'] #2546 SP Shen
Depend on igraph instead of python-igraph #2566 P Angerer
rank_genes_groups() now handles unsorted groups as intended #2589 S Dicks
rank_genes_groups_df() now works for rank_genes_groups() with method="logreg" #2601 S Dicks
scanpy.tl._utils._choose_representation now works with n_pcs if bigger than settings.N_PCS #2610 S Dicks
1.9.3 2023-03-02#
Bug fixes#
Variety of fixes against pandas 2.0.0rc0 #2434 I Virshup
1.9.2 2023-02-16#
Bug fixes#
highly_variable_genes() layer argument now works in tandem with batches #2302 D Schaumont
highly_variable_genes() with flavor='cell_ranger' now handles the case in #2230 where the number of calculated dispersions is less than n_top_genes #2231 L Zappia
Fix compatibility with matplotlib 3.7 #2414 I Virshup P Fisher
Fix scrublet numpy matrix compatibility issue #2395 A Gayoso
1.9.1 2022-04-05#
Bug fixes#
normalize_total() works when Dask is not installed #2209 R Cannoodt
Fix embedding plots by bumping matplotlib dependency to version 3.4 #2212 I Virshup
1.9.0 2022-04-01#
Tutorials#
New tutorial on the usage of Pearson Residuals: How to preprocess UMI count data with analytic Pearson residuals J Lause, G Palla
Materials and recordings for Scanpy workshops by Maren Büttner
Experimental module#
Added scanpy.experimental module! Currently contains functionality related to pearson residuals in scanpy.experimental.pp #1715 J Lause, G Palla, I Virshup. This includes:
normalize_pearson_residuals() for Pearson Residuals normalization
highly_variable_genes() for HVG selection with Pearson Residuals
normalize_pearson_residuals_pca() for Pearson Residuals normalization and dimensionality reduction with PCA
recipe_pearson_residuals() for Pearson Residuals normalization, HVG selection and dimensionality reduction with PCA
Features#
filter_rank_genes_groups() now allows filtering with absolute values of log fold change #1649 S Rybakov
_choose_representation now subsets the provided representation to n_pcs, regardless of the name of the provided representation (should affect mostly neighbors()) #2179 I Virshup PG Majev
scanpy.pp.scrublet() (and related functions) can now be used on AnnData objects containing multiple batches #1965 J Manning
Number of variables plotted with pca_loadings() can now be controlled with n_points argument. Additionally, variables are no longer repeated if the anndata has fewer than 30 variables #2075 Yves33
Dask arrays now work with scanpy.pp.normalize_total() #1663 G Buckley, I Virshup
embedding_density() now allows more than 10 groups #1936 A Wolf
Embedding plots can now pass colorbar_loc to specify the location of colorbar legend, or pass None to not show a colorbar #1821 A Schaar I Virshup
Embedding plots now have a dimensions argument, which lets users select which dimensions of their embedding to plot and uses the same broadcasting rules as other arguments #1538 I Virshup
print_versions() now uses session_info #2089 P Angerer I Virshup
Ecosystem#
Multiple packages have been added to our ecosystem page, including:
Bug fixes#
Fixed finding variables with use_raw=True and basis=None in scanpy.pl.scatter() #2027 E Rice
Fixed scanpy.pp.scrublet() to address #1957 FlMai and ensure raw counts are used for simulation
Functions in scanpy.datasets no longer throw OldFormatWarnings when using anndata 0.8 #2096 I Virshup
Fixed use of scanpy.pp.neighbors() with method='rapids': RAPIDS cuML no longer returns a squared Euclidean distance matrix, so we should not square-root the kNN distance matrix. #1828 M Zaslavsky
Removed pytables dependency by implementing read_10x_h5 with h5py due to installation errors on Windows #2064
Fixed bug in scanpy.external.pp.hashsolo() where default value was set improperly #2190 B Reiz
Fixed bug in scanpy.pl.embedding() functions where an error could be raised when there were missing values and large numbers of categories #2187 I Virshup
Version 1.8#
1.8.2 2021-11-3#
Documentation#
Update conda installation instructions #1974 L Heumos
Bug fixes#
Fix plotting after scanpy.tl.filter_rank_genes_groups() #1942 S Rybakov
Fix use_raw=None using anndata.AnnData.var_names if anndata.AnnData.raw is present in scanpy.tl.score_genes() #1999 M Klein
Fix compatibility with UMAP 0.5.2 #2028 L Mcinnes
Fixed non-determinism in scanpy.pl.paga() node positions #1922 I Virshup
Ecosystem#
Added PASTE (a tool to align and integrate spatial transcriptomics data) to scanpy ecosystem.
1.8.1 2021-07-07#
Bug fixes#
Fixed reproducibility of scanpy.tl.score_genes(). Calculation and output is now float64 type. #1890 I Kucinski
Workarounds for some changes/bugs in pandas 1.3 #1918 I Virshup
Fixed bug where sc.pl.paga_compare could mislabel nodes on the paga graph #1898 I Virshup
Fixed handling of use_raw with scanpy.tl.rank_genes_groups() #1934 I Virshup
1.8.0 2021-06-28#
Metrics module#
Added scanpy.metrics module!
Added scanpy.metrics.gearys_c() for spatial autocorrelation #915 I Virshup
Added scanpy.metrics.morans_i() for global spatial autocorrelation #1740 I Virshup, G Palla
Added scanpy.metrics.confusion_matrix() for comparing labellings #915 I Virshup
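For readers unfamiliar with the statistic, the following pure-Python sketch computes the textbook form of Moran's I for values on a weighted graph. It is not scanpy.metrics.morans_i itself; representing the weights as a dense nested list is an assumption made to keep the example self-contained.

```python
def morans_i(values, weights):
    """Textbook Moran's I for `values` on a graph with dense `weights`.

    I = (N / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2
    where N is the number of nodes and W is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_total = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    return (n / w_total) * cross / sum(d * d for d in dev)


# Path graph 0-1-2-3 with symmetric unit weights.
path = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1, 1, 0, 0], path)    # similar values sit next to each other
alternating = morans_i([1, 0, 1, 0], path)  # neighbors always differ
```

Positive values indicate that neighboring nodes carry similar values, negative values that they tend to differ; on this toy graph the clustered signal scores 1/3 and the alternating one scores -1.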
Features#
Added layer and copy kwargs to normalize_total() #1667 I Virshup
Added vcenter and norm arguments to the plotting functions #1551 G Eraslan
Standardized and expanded available arguments to the sc.pl.rank_genes_groups* family of functions. #1529 F Ramirez I Virshup
See examples sections of rank_genes_groups_dotplot() and rank_genes_groups_matrixplot() for demonstrations.
scanpy.tl.tsne() now supports the metric argument and records the passed parameters #1854 I Virshup
scanpy.pl.scrublet_score_distribution() now uses same API as other scanpy functions for saving/showing plots #1741 J Manning
Ecosystem#
Documentation#
Added rendered examples to many plotting functions #1664 A Schaar L Zappia bio-la L Hetzel L Dony M Buttner K Hrovatin F Ramirez I Virshup LouisK92 mayarali
Integrated DocSearch, a find-as-you-type documentation index search. #1754 P Angerer
Reorganized reference docs #1753 I Virshup
Clarified docs issues for neighbors(), diffmap(), calculate_qc_metrics() #1680 G Palla
Fixed typos in grouped plot doc-strings #1877 C Rands
Extended examples for differential expression plotting. #1529 F Ramirez
See rank_genes_groups_dotplot() or rank_genes_groups_matrixplot() for examples.
Bug fixes#
Fix scanpy.pl.paga_path() TypeError with recent versions of anndata #1047 P Angerer
Fix detection of whether IPython is running #1844 I Virshup
Fixed reproducibility of scanpy.tl.diffmap() (added random_state) #1858 I Kucinski
Fixed errors and warnings from embedding plots with small numbers of categories after sns.set_palette was called #1886 I Virshup
Fixed handling of gene_symbols argument in a number of sc.pl.rank_genes_groups* functions #1529 F Ramirez I Virshup
Fixed handling of use_raw for sc.tl.rank_genes_groups when no .raw is present #1895 I Virshup
scanpy.pl.rank_genes_groups_violin() now works for raw=False #1669 M van den Beek
scanpy.pl.dotplot() now uses smallest_dot argument correctly #1771 S Flemming
Development Process#
Switched to flit for building and deploying the package, a simple tool with an easy to understand command line interface and metadata #1527 P Angerer
Use pre-commit for style checks #1684 #1848 L Heumos I Virshup
Deprecations#
Dropped support for Python 3.6. More details here. #1897 I Virshup
Deprecated layers and layers_norm kwargs to normalize_total() #1667 I Virshup
Deprecated MulticoreTSNE backend for scanpy.tl.tsne() #1854 I Virshup
Version 1.7#
1.7.2 2021-04-07#
Bug fixes#
scanpy.logging.print_versions() now works when python<3.8 #1691 I Virshup
scanpy.pp.regress_out() now uses joblib as the parallel backend, and should stop oversubscribing threads #1694 I Virshup
scanpy.pp.highly_variable_genes() with flavor="seurat_v3" now returns correct gene means and -variances when used with batch_key #1732 J Lause
scanpy.pp.highly_variable_genes() now throws a warning instead of an error when non-integer values are passed for method "seurat_v3". The check can be skipped by passing check_values=False. #1679 G Palla
Ecosystem#
1.7.1 2021-02-24#
Documentation#
More twitter handles for core devs #1676 G Eraslan
Bug fixes#
dendrogram() uses 1 - correlation as the distance matrix to compute the dendrogram #1614 F Ramirez
Fixed obs_df()/var_df() erroring when keys not passed #1637 I Virshup
Fixed argument handling for scanpy.pp.scrublet() J Manning
Fixed passing of kwargs to scanpy.pl.violin() when stripplot was also used #1655 M van den Beek
Fixed colorbar creation in scanpy.pl.timeseries_as_heatmap #1654 M van den Beek
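The "1 - correlation" distance used for the dendrogram can be sketched in a few lines of plain Python. The functions below are illustrative only (the names pearson and correlation_distance are not scanpy APIs), and Pearson correlation is assumed as the correlation measure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    norm = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return cov / norm


def correlation_distance(profiles):
    """Pairwise distance 1 - r between named group profiles.

    Perfectly correlated groups get distance 0, uncorrelated groups 1,
    and perfectly anti-correlated groups 2.
    """
    names = list(profiles)
    return {
        (a, b): 1 - pearson(profiles[a], profiles[b])
        for a in names
        for b in names
    }


profiles = {"A": [1, 2, 3], "B": [2, 4, 6], "C": [3, 2, 1]}
distances = correlation_distance(profiles)
```

Groups "A" and "B" have proportional profiles and end up at distance 0, so a hierarchical clustering on these distances would merge them first, while "C" sits at the maximum distance of 2 from both.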
1.7.0 2021-02-03#
Features#
Add new 10x Visium datasets to visium_sge() #1473 G Palla
Enable download of source image for 10x visium datasets in visium_sge() #1506 H Spitzer
Refactor of scanpy.pl.spatial(). Better support for plotting without an image, as well as directly providing images #1512 G Palla
Dict input for scanpy.queries.enrich() #1488 G Eraslan
rank_genes_groups_df() can now return fraction of cells in a group expressing a gene, and allows retrieving values for multiple groups at once #1388 G Eraslan
Color annotations for gene sets in heatmap() are now matched to color for cluster #1511 L Sikkema
PCA plots can now annotate axes with variance explained #1470 bfurtwa
Plots with groupby arguments can now group by values in the index by passing the index’s name (like pd.DataFrame.groupby). #1583 F Ramirez
Added na_color and na_in_legend keyword arguments to embedding() plots. Allows specifying color for missing or filtered values in plots like umap() or spatial() #1356 I Virshup
embedding() plots now support passing dict of {cluster_name: cluster_color, ...} for palette argument #1392 I Virshup
External tools (new)#
Add Scanorama integration to scanpy external API (scanorama_integrate(), Hie et al. [2019]) #1332 B Hie
Scrublet [Wolock et al., 2019] integration: scrublet(), scrublet_simulate_doublets(), and plotting method scrublet_score_distribution() #1476 J Manning
hashsolo() for HTO demultiplexing [Bernstein et al., 2020] #1432 NJ Bernstein
Added scirpy (sc-AIRR analysis) to ecosystem page #1453 G Sturm
Added scvi-tools to ecosystem page #1421 A Gayoso
External tools (changes)#
Updates for palantir() and palantir_results() #1245 A Mousa
Fixes to harmony_timeseries() docs #1248 A Mousa
Support for leiden clustering by scanpy.external.tl.phenograph() #1080 A Mousa
Deprecate scanpy.external.pp.scvi #1554 G Xing
Updated default params of sam() to work with larger data #1540 A Tarashansky
Documentation#
New contribution guide #1544 I Virshup
zshinstallation instructions #1444 P Angerer
Performance#
Speed up read_10x_h5() #1402 P Weiler
Bugfixes#
Consistent fold-change, fractions calculation for filter_rank_genes_groups #1391 S Rybakov
Fixed bug where score_genes would error if one gene was passed #1398 I Virshup
Fixed log1p inplace on integer dense arrays #1400 I Virshup
Fix docstring formatting for rank_genes_groups() #1417 P Weiler
Removed PendingDeprecationWarnings from use of np.matrix #1424 P Weiler
Fixed indexing bug in scanpy.pp.highly_variable_genes #1456 V Bergen
Fix default number of genes for marker_genes_overlap #1464 MD Luecken
Fixed passing groupby and dendrogram_key to dendrogram() #1465 M Varma
Fixed download path of pbmc3k_processed #1472 D Strobl
Better error message when computing DE with a group of size 1 #1490 J Manning
Update cugraph API usage for v0.16 #1494 R Ilango
Fixed marker_gene_overlap default value for top_n_markers #1464 MD Luecken
Pass random_state to RAPIDs UMAP #1474 C Nolet
Fixed anndata version requirement for concat() (re-exported from scanpy as sc.concat) #1491 I Virshup
Fixed the width of the progress bar when downloading data #1507 M Klein
Updated link for moignard15 dataset #1542 I Virshup
Fixed bug where calling set_figure_params could block if IPython was installed, but not used. #1547 I Virshup
violin() no longer fails if .raw not present #1548 I Virshup
spatial() refactoring and better handling of spatial data #1512 G Palla
Version 1.6#
1.6.0 2020-08-15#
This release includes an overhaul of dotplot(), matrixplot(), and stacked_violin() (#1210 F Ramirez), and of the internals of rank_genes_groups() (#1156 S Rybakov).
Overhaul of dotplot(), matrixplot(), and stacked_violin() #1210 F Ramirez#
An overhauled tutorial Core plotting functions.
New plotting classes can be accessed directly (e.g., DotPlot) or using the return_fig param.
It is possible to plot log fold change and p-values in the rank_genes_groups_dotplot() family of functions.
Added ax parameter which allows embedding the plot in other images.
Added option to include a bar plot instead of the dendrogram containing the cell/observation totals per category.
Return a dictionary of axes for further manipulation. This includes the main plot, legend and dendrogram to totals
Legends can be removed.
The groupby param can take a list of categories, e.g., groupby=['tissue', 'cell type'].
Added padding parameter to dotplot and stacked_violin. #1270
Added title for colorbar and positioned as in dotplot for matrixplot().
dotplot() changes:
Improved the colorbar and size legend for dotplots. Now the colorbar and size have titles, which can be modified using the colorbar_title and size_title params. They also align at the bottom of the image and do not shrink if the dotplot image is smaller.
Allow plotting genes in rows and categories in columns (swap_axes).
Using DotPlot, the dot_edge_color and line width can be modified, a grid can be added, and other modifications are enabled.
A new style was added in which the dots are replaced by an empty circle and the square behind the circle is colored (like in matrixplots).
stacked_violin() changes:
Violins can be colored based on average gene expression as in dotplots.
The linewidth of the violin plots is thinner.
Removed the ticks for the y-axis as they tend to overlap with each other. Using the style method they can be displayed if needed.
Additions#
concat() is now exported from scanpy, see Concatenation for more info. #1338 I Virshup
Added highly variable gene selection strategy from Seurat v3 #1204 A Gayoso
Added backup_url param to read_10x_h5() #1296 A Gayoso
Allow prefix for read_10x_mtx() #1250 G Sturm
Optional tie correction for the 'wilcoxon' method in rank_genes_groups() #1330 S Rybakov
Use sinfo for print_versions() and add print_header() to do what it previously did. #1338 I Virshup #1373
Bug fixes#
Avoid warning in rank_genes_groups() if ‘t-test’ is passed #1303 A Wolf
Restrict sphinx version to <3.1, >3.0 #1297 I Virshup
Clean up _ranks and fix dendrogram for scipy 1.5 #1290 S Rybakov
Use .raw to translate gene symbols if applicable #1278 E Rice
Fix diffmap (#1262) G Eraslan
Fix neighbors in spring_project #1260 S Rybakov
Bumped version requirement of scipy to scipy>1.4 to support rmatmat argument of LinearOperator #1246 I Virshup
Fix asymmetry of scores for the 'wilcoxon' method in rank_genes_groups() #754 S Rybakov
Avoid trimming of gene names in rank_genes_groups() #753 S Rybakov
Version 1.5#
1.5.1 2020-05-21#
Bug fixes#
1.5.0 2020-05-15#
The 1.5.0 release adds a lot of new functionality, much of which takes advantage of anndata updates 0.7.0 - 0.7.2. Highlights of this release include support for spatial data, dedicated handling of graphs in AnnData, sparse PCA, an interface with scvi, and others.
Spatial data support#
Tutorials for basic analysis and integration with single cell data G Palla
read_visium() read 10x Visium data #1034 G Palla, P Angerer, I Virshup
visium_sge() load Visium data directly from 10x Genomics #1013 M Mirkazemi, G Palla, P Angerer
New functionality#
External tools#
Performance#
pca() now uses efficient implicit centering for sparse matrices. This can lead to significantly improved performance for large datasets #1066 A Tarashansky
score_genes() now has an efficient implementation for sparse matrices with missing values #1196 redst4r.
Code design#
stacked_violin() can now be used as a subplot #1084 P Angerer
score_genes() has improved logging #1119 G Eraslan
scale() now saves mean and standard deviation in the var #1173 A Wolf
harmony_timeseries() #1091 A Mousa
Bug fixes#
combat() now works when obs_names aren’t unique. #1215 I Virshup
scale() can now be used on dense arrays without centering #1160 simonwm
regress_out() now works when some features are constant #1194 simonwm
normalize_total() errored if the passed object was a view #1200 I Virshup
neighbors() sometimes ignored the n_pcs param #1124 V Bergen
ebi_expression_atlas() which contained some out-of-date URLs #1102 I Virshup
highly_variable_genes() which could lead to incorrect results when the batch_key argument was used #1180 G Eraslan
ingest() where an inconsistent number of neighbors was used #1111 S Rybakov
Version 1.4#
1.4.6 2020-03-17#
Functionality in external#
sam() self-assembling manifolds [Tarashansky et al., 2019] #903 A Tarashansky
harmony_timeseries() for trajectory inference on discrete time points #994 A Mousa
wishbone() for trajectory inference (bifurcations) #1063 A Mousa
Code design#
Bug fixes#
1.4.5 2019-12-30#
Please install scanpy==1.4.5.post3 instead of scanpy==1.4.5.
New functionality#
ingest() maps labels and embeddings of reference data to new data (see Integrating data using ingest and BBKNN) #651 S Rybakov, A Wolf
queries received many updates including enrichment through gprofiler and more advanced biomart queries #467 I Virshup
set_figure_params() allows setting figsize and accepts facecolor='white', useful for working in dark mode A Wolf
Code design#
downsample_counts now always preserves the dtype of its input, instead of converting floats to ints #865 I Virshup
run neighbors on a GPU using rapids #830 T White
param docs from typed params P Angerer
embedding_density() now only takes one positional argument; similar for embedding_density(), which gains a param groupby #965 A Wolf
webpage overhaul, ecosystem page, release notes, tutorials overhaul #960 #966 A Wolf
Warning
changed default solver in pca() from auto to arpack
changed default use_raw in score_genes() from False to None
1.4.4 2019-07-20#
New functionality#
scanpy.get adds helper functions for extracting data in convenient formats #619 I Virshup
Bug fixes#
Stopped deprecation warnings from AnnData 0.6.22 I Virshup
Code design#
normalize_total() gains param exclude_highly_expressed, and fraction is renamed to max_fraction with better docs A Wolf
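The interplay of exclude_highly_expressed and max_fraction can be sketched as follows. This is a pure-Python illustration rather than scanpy's implementation, and the exclusion rule used here (a gene is left out of the size factor if it exceeds max_fraction of a cell's total counts in at least one cell) is an assumption that should be checked against the normalize_total() documentation.

```python
def normalize_total(counts, target_sum, *, exclude_highly_expressed=False,
                    max_fraction=0.05):
    """Scale each cell (row) so that its counts sum to `target_sum`.

    Assumed rule: with exclude_highly_expressed=True, genes exceeding
    `max_fraction` of the total counts in any cell are ignored when the
    per-cell size factor is computed, but they are still rescaled.
    """
    n_genes = len(counts[0])
    totals = [sum(row) for row in counts]
    excluded = set()
    if exclude_highly_expressed:
        excluded = {
            g
            for g in range(n_genes)
            if any(row[g] > max_fraction * total
                   for row, total in zip(counts, totals))
        }
    normalized = []
    for row in counts:
        size = sum(v for g, v in enumerate(row) if g not in excluded)
        factor = target_sum / size if size else 0.0
        normalized.append([v * factor for v in row])
    return normalized


counts = [[90, 5, 5], [2, 4, 4]]
plain = normalize_total(counts, target_sum=10)
robust = normalize_total(counts, target_sum=10,
                         exclude_highly_expressed=True, max_fraction=0.5)
```

In the plain run the dominant first gene drags the other genes of the first cell down to 0.5 each; with the exclusion enabled, the remaining genes of every cell sum to the target, so they stay comparable across cells.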
1.4.3 2019-05-14#
Bug fixes#
neighbors() correctly infers n_neighbors again from params, which was temporarily broken in v1.4.2 I Virshup
Code design#
calculate_qc_metrics() is single threaded by default for datasets under 300,000 cells – allowing cached compilation #615 I Virshup
1.4.2 2019-05-06#
New functionality#
combat() supports additional covariates which may include adjustment variables or biological condition #618 G Eraslan
highly_variable_genes() has a batch_key option which performs HVG selection in each batch separately to avoid selecting genes that vary strongly across batches #622 G Eraslan
Bug fixes#
rank_genes_groups() t-test implementation doesn’t return NaN when variance is 0, also changed to scipy’s implementation #621 I Virshup
umap() with init_pos='paga' detects correct dtype A Wolf
louvain() and leiden() auto-generate key_added=louvain_R upon passing restrict_to, which was temporarily changed in 1.4.1 A Wolf
Code design#
neighbors() and umap() got rid of UMAP legacy code and introduced UMAP as a dependency #576 S Rybakov
1.4.1 2019-04-26#
New functionality#
Scanpy has a command line interface again. Invoking it with scanpy somecommand [args] calls scanpy-somecommand [args], except for builtin commands (currently scanpy settings) #604 P Angerer
ebi_expression_atlas() allows convenient download of EBI expression atlas I Virshup
marker_gene_overlap() computes overlaps of marker genes M Luecken
filter_rank_genes_groups() filters out genes based on fold change and fraction of cells expressing genes F Ramirez
normalize_total() replaces normalize_per_cell(), is more efficient and provides a parameter to only normalize using a fraction of expressed genes S Rybakov
downsample_counts() has been sped up, changed default value of replace parameter to False #474 I Virshup
embedding_density() computes densities on embeddings #543 M Luecken
palantir() interfaces Palantir [Setty et al., 2019] #493 A Mousa
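The command line dispatch described in the first item above follows the familiar pattern of forwarding unknown subcommands to an executable named after them. A minimal sketch of that pattern, using only the standard library, could look as follows. The names BUILTIN_COMMANDS, resolve_external and main are hypothetical, and the error handling is a guess; this is not scanpy's actual CLI code.

```python
import shutil
import subprocess
import sys

# Taken from the note above: "settings" is currently the only builtin command.
BUILTIN_COMMANDS = {"settings"}


def resolve_external(argv):
    """Map ['somecommand', *args] to ['scanpy-somecommand', *args].

    Returns None for builtin commands, which would be handled in-process.
    """
    if not argv:
        raise ValueError("expected a subcommand")
    command, *args = argv
    if command in BUILTIN_COMMANDS:
        return None
    return [f"scanpy-{command}", *args]


def main(argv):
    external = resolve_external(argv)
    if external is None:
        print(f"builtin command: {argv[0]}")
        return 0
    if shutil.which(external[0]) is None:
        # Assumed behavior: report a missing executable instead of crashing.
        print(f"no such command: {external[0]}", file=sys.stderr)
        return 127
    return subprocess.call(external)
```

Calling resolve_external(["somecommand", "--flag"]) yields the argument list for scanpy-somecommand, so any executable on PATH with that prefix acts as a subcommand.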
Code design#
.layers support of scatter plots F Ramirez
fix double-logarithmization in compute of log fold change in rank_genes_groups() A Muñoz-Rojas
fix return sections of docs P Angerer
Version 1.3#
1.3.8 2019-02-05#
various documentation and dev process improvements
Added combat() function for batch effect correction [Johnson et al., 2006, Leek et al., 2017, Pedersen, 2012] #398 M Lange
1.3.7 2019-01-02#
API changed from import scanpy as sc to import scanpy.api as sc.
phenograph() wraps the graph clustering package Phenograph [Levine et al., 2015] thanks to A Mousa
1.3.6 2018-12-11#
Major updates#
a new plotting gallery for visualizing-marker-genes F Ramirez
tutorials are integrated on ReadTheDocs, pbmc3k and paga-paul15 A Wolf
Interactive exploration of analysis results through manifold viewers#
CZI’s cellxgene directly reads .h5ad files the cellxgene developers
the UCSC Single Cell Browser requires exporting via cellbrowser() M Haeussler
Code design#
highly_variable_genes() supersedes filter_genes_dispersion(); it gives the same results but, by default, expects logarithmized data and doesn’t subset A Wolf
1.3.5 2018-12-09#
uncountable figure improvements #369 F Ramirez
1.3.4 2018-11-24#
leiden() wraps the recent graph clustering package by Traag et al. [2019] K Polanski
bbknn() wraps the recent batch correction package [Polański et al., 2019] K Polanski
calculate_qc_metrics() calculates a number of quality control metrics, similar to calculateQCMetrics from Scater [McCarthy et al., 2017] I Virshup
1.3.3 2018-11-05#
Major updates#
a fully distributed preprocessing backend T White and the Laserson Lab
Code design#
read_10x_h5() and read_10x_mtx() read Cell Ranger 3.0 outputs #334 Q Gong
Note
Also see changes in anndata 0.6.
changed default compression to None in write_h5ad() to speed up read and write; disk space use is usually less critical
performance gains in write_h5ad() due to better handling of strings and categories S Rybakov
1.3.1 2018-09-03#
RNA velocity in single cells [La Manno et al., 2018]#
Scanpy and AnnData support loom’s layers so that computations for single-cell RNA velocity [La Manno et al., 2018] become feasible S Rybakov and V Bergen
scvelo harmonizes with Scanpy and is able to process loom files with splicing information produced by Velocyto [La Manno et al., 2018], it runs a lot faster than the count matrix analysis of Velocyto and provides several conceptual developments
Plotting (Generic)#
There now is a section on imputation in external:#
magic() for imputation using data diffusion [van Dijk et al., 2018] #187 S Gigante
dca() for imputation and latent space construction using an autoencoder [Eraslan et al., 2019] #186 G Eraslan
Version 1.2#
1.2.1 2018-06-08#
Plotting of Generic marker genes and quality control.#
highest_expr_genes() for quality control; plot genes with highest mean fraction of cells, similar to plotQC of Scater [McCarthy et al., 2017] #169 F Ramirez
1.2.0 2018-06-08#
Version 1.1#
1.1.0 2018-06-01#
set_figure_params() by default passes vector_friendly=True and allows you to produce reasonably sized pdfs by rasterizing large scatter plots A Wolf
draw_graph() defaults to the ForceAtlas2 layout [Chippada, 2018, Jacomy et al., 2014], which is often more visually appealing and whose computation is much faster S Wollock
scatter() also plots along variables axis MD Luecken
regress_out() is back to multiprocessing F Ramirez
read() reads compressed text files G Eraslan
mitochondrial_genes() for querying mito genes FG Brundu
mnn_correct() for batch correction [Haghverdi et al., 2018, Kang, 2018]
phate() for low-dimensional embedding [Moon et al., 2019] S Gigante
sandbag(), cyclone() for scoring genes [Fechtner, 2018, Scialdone et al., 2015]
Version 1.0#
1.0.0 2018-03-30#
Major updates#
Scanpy is much faster and more memory efficient: preprocess, cluster and visualize 1.3M cells in 6h, 130K cells in 14min, and 68K cells in 3min A Wolf
the API gained a preprocessing function neighbors() and a class Neighbors() to which all basic graph computations are delegated A Wolf
Warning
Upgrading to 1.0 isn’t fully backwards compatible in the following changes
the graph-based tools louvain() dpt() draw_graph() umap() diffmap() paga() require prior computation of the graph: sc.pp.neighbors(adata, n_neighbors=5); sc.tl.louvain(adata) instead of previously sc.tl.louvain(adata, n_neighbors=5)
install numba via conda install numba, which replaces cython
the default connectivity measure (dpt will look different using default settings) changed. setting method='gauss' in sc.pp.neighbors uses gauss kernel connectivities and reproduces the previous behavior, see, for instance, the example paul15.
namings of returned annotation have changed for less bloated AnnData objects, which means that some of the unstructured annotation of old AnnData files is not recognized anymore
replace occurrences of group_by with groupby (consistency with pandas)
it is worth checking out the notebook examples to see changes, e.g. the seurat example.
upgrading scikit-learn from 0.18 to 0.19 changed the implementation of PCA, some results might therefore look slightly different
Further updates#
UMAP [McInnes et al., 2018] can serve as a first visualization of the data just as tSNE; in contrast to tSNE, UMAP directly embeds the single-cell graph and is faster; UMAP is also used for measuring connectivities and computing neighbors, see neighbors() A Wolf
graph abstraction: AGA is renamed to PAGA: paga(); now, it only measures connectivities between partitions of the single-cell graph; pseudotime and clustering need to be computed separately via louvain() and dpt(); the connectivity measure has been improved A Wolf
logistic regression for finding marker genes rank_genes_groups() with parameter method='logreg' A Wolf
louvain() provides a better implementation for reclustering via restrict_to A Wolf
scanpy no longer modifies rcParams upon import, call scanpy.set_figure_params to set the ‘scanpy style’ A Wolf
default cache directory is ./cache/, set settings.cachedir to change this; nested directories in this are avoided A Wolf
show edges in scatter plots based on graph visualization draw_graph() and umap() by passing edges=True A Wolf
downsample_counts() for downsampling counts MD Luecken
default 'louvain_groups' are called 'louvain' A Wolf
'X_diffmap' contains the zero component, plotting remains unchanged A Wolf
Version 0.4#
0.4.4 2018-02-26#
embed cells using umap() [McInnes et al., 2018] #92 G Eraslan
score sets of genes, e.g. for cell cycle, using score_genes() [Satija et al., 2015]: notebook
0.4.3 2018-02-09#
clustermap(): heatmap from hierarchical clustering, based on seaborn.clustermap() [Waskom et al., 2016] A Wolf
only return matplotlib.axes.Axes in plotting functions of sc.pl when show=False, otherwise None A Wolf
0.4.2 2018-01-07#
amendments in PAGA and its plotting functions A Wolf
0.4.0 2017-12-23#
export to SPRING [Weinreb et al., 2017] for interactive visualization of data: spring tutorial S Wollock
Version 0.3#
0.3.2 2017-11-29#
finding marker genes via rank_genes_groups_violin() improved, see #51 F Ramirez
0.3.0 2017-11-16#
AnnData gains method concatenate() A Wolf
AnnData is available as the separate anndata package P Angerer, A Wolf
results of PAGA simplified A Wolf
Version 0.2#
0.2.9 2017-10-25#
Initial release of the new trajectory inference method PAGA#
paga() computes an abstracted, coarse-grained (PAGA) graph of the neighborhood graph A Wolf
paga_compare() plots this graph next to an embedding A Wolf
paga_path() plots a heatmap through a node sequence in the PAGA graph A Wolf
0.2.1 2017-07-24#
Scanpy includes preprocessing, visualization, clustering, pseudotime and trajectory inference, differential expression testing and simulation of gene regulatory networks. The implementation efficiently deals with datasets of more than one million cells. A Wolf, P Angerer
Version 0.1#
0.1.0 2017-05-17#
Scanpy computationally outperforms and allows reproducing both the Cell Ranger R kit’s and most of Seurat’s clustering workflows. A Wolf, P Angerer