Project Difficulty
Challenging
Target Audience
Computer Science, Computer Engineering or AI students
Description
This project challenges students to build an intelligent automation tool for porting software packages, for use in domains such as bioinformatics pipelines with Nextflow or statistics with R.
Although most top Bioconda packages now support Linux/Arm, there is still a significant gap for native macOS on Apple Silicon, where many packages default to emulated x86 environments. Similarly, in the R community, Windows-on-Arm support for community-created packages is lacking, and many packages fail to build because of issues such as x86-specific code. Given the large number of community packages, patching them manually is time-consuming and inefficient, since many need similar, repetitive adjustments. This highlights the need for a scalable, automated solution. The goal is to build a sophisticated system (beyond simple shell scripts) that uses dependency graph analysis and machine learning to:
- Identify unported packages
- Trace recursive dependency issues
- Recommend or auto-generate build recipes and steps
- Evaluate build success and reattempt intelligently
- Generate pull requests when confident of a fix
- For complex packages, offer guidance to developers on how to port them, for example by suggesting tools like SSE2NEON for translating x86 SSE intrinsics
- Be extensible to work with various packaging systems and languages
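As a starting point for the dependency graph analysis, the "trace recursive dependency issues" step can be sketched with Python's standard library alone. This is a minimal illustration, not a prescribed design: the package names, the `EXAMPLE_DEPS` mapping, and the `EXAMPLE_PORTED` set are hypothetical placeholders, and a real tool would read this information from recipe metadata or build reports. It assumes Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter

# Hypothetical input: package -> direct dependencies. Names are placeholders.
EXAMPLE_DEPS = {
    "pipeline-tool": ["aligner", "stats-lib"],
    "aligner": ["simd-kernels"],
    "stats-lib": [],
    "simd-kernels": [],
}
# Hypothetical set of packages that already build natively on the target.
EXAMPLE_PORTED = {"stats-lib"}


def transitive_blockers(package, deps, ported):
    """Return every unported package that `package` depends on, directly or not.

    Assumption: a ported package has only ported dependencies, so the walk
    does not descend into packages that are already ported.
    """
    seen = set()
    stack = [package]
    while stack:
        current = stack.pop()
        for dep in deps.get(current, ()):
            if dep not in ported and dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


def porting_order(deps, ported):
    """Order unported packages so each one follows its unported dependencies.

    Raises graphlib.CycleError if the unported packages depend on each other
    in a cycle.
    """
    all_packages = set(deps) | {d for ds in deps.values() for d in ds}
    unported = all_packages - set(ported)
    # TopologicalSorter expects node -> predecessors (here: dependencies).
    graph = {
        pkg: {d for d in deps.get(pkg, ()) if d in unported}
        for pkg in unported
    }
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(transitive_blockers("pipeline-tool", EXAMPLE_DEPS, EXAMPLE_PORTED))
    print(porting_order(EXAMPLE_DEPS, EXAMPLE_PORTED))
```

With the placeholder data, `porting_order` yields `simd-kernels`, then `aligner`, then `pipeline-tool`, which suggests attempting builds from the leaves of the graph upward so that a failure in a low-level package is not reported repeatedly for everything that depends on it.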
🔬 Students will gain practical experience with CI/CD systems and Python packaging with Bioconda.
This project is an ideal blend of automation, machine learning, and systems programming, built with real-world impact in mind. The outcome could directly contribute to open source ecosystems and help bring cutting-edge bioinformatics tools to a wider range of hardware.
Hardware / Software Requirements
Hardware: Access to Apple Silicon, either through the cloud or with physical hardware
Languages: Python, Bash, Nextflow
Tooling: nf-core pipelines, Conda, Bioconda, Docker/Singularity, Snakemake
Resources
Example Porting Script for Bioconda
Bioconductor Build Reports
Package installation results for CRAN and Bioconductor packages
Arm64 nf-core pipelines
NCBI Datasets
Benefits
- Standout projects could be internally referred for relevant positions at Arm!
- If your submission is approved, you will receive a recognised badge that you can list on your CV and share on LinkedIn. A great way to stand out from the crowd!
- Problem-Solving Experience: Opportunity to debug and optimize bioinformatics software for emerging computing architectures.
- Industry Relevance: Hands-on experience with Arm-based architectures, applicable to genomics research and cloud computing.