
Share your safe, non-confidential protein sequences and join us in creating a valuable open dataset to explore protein expression.

Recombinant protein expression is a foundation of modern biology, from academic discovery and drug development to bioproduct manufacturing.
Yet expression remains difficult to predict: a protein’s yield can depend on codon choice, genetic context, expression system, host state, and purification workflow. Most of this knowledge is still scattered across private experiments, failed attempts, and lab notebooks.
Ailurus Open Express is our effort to build a large-scale, million-level open dataset that makes protein expression more measurable, learnable, and useful for the community.
Ailurus is curating safe, soluble protein sequences from resources such as UniProt, combining selected coding sequences with the Ailurus vec expression vector library, and running large parallel assays to measure relative intracellular soluble expression and purified-protein output using PandaPure.
We invite researchers, scientists, and biodevelopers to submit safe, non-confidential proteins of interest and help shape this open science dataset. The dataset will therefore be built from two sources of sequence diversity: public protein databases and community-submitted proteins.
Relative soluble-expression signals for coding sequences tested across Ailurus vec genetic contexts.
Estimated protein amount, measured by Bradford absorbance, for selected samples purified by PandaPure.
All released data will be shared under the Ailurus Open License v0.1.
Submit non-confidential, safe proteins under 600 amino acids via the form below, or email your sequence list to support@ailurus.bio.
Help advance open science
We are especially looking for DNA synthesis, sequencing, biofoundry, and AIxBio ecosystem partners. Email us at support@ailurus.bio.