Job Details - National College Student Employment Service Platform

AI Kernel Performance Intern, Front-Runner Program (领跑者计划)
[Part-time]

10.0-12.0K/month | Bachelor's degree or above | No limit on number of hires

Any major

Source: BOSS Zhipin (boss直聘)

310115

This position is no longer listed.

Job Details

The Front-Runner Program (领跑者计划) is Espressif's internship program for students in China and abroad graduating in 2027, designed as a pipeline to full-time positions. The work location is Shanghai, China.

The Opportunity
You will be the reason our chip is fast. You will write the hand-tuned kernels that power Large Language Models (LLMs) on our custom RISC-V hardware. You will work directly with hardware architects to exploit our proprietary Matrix (RVM) and Vector (RVV) extensions, squeezing every last FLOP out of the silicon.

Key Responsibilities
· Kernel Implementation: Write kernels for GEMM and common epilogues (bias/activation/quant); implement Softmax/RMSNorm; evolve toward attention kernels as the project matures.
· Micro-Optimization: Analyze assembly output. Did the compiler unroll the loop? Did we stall on a memory load? You fix it.
· Tiling & Layout: Calculate the optimal way to chop a large tensor into "tiles" that fit in our L1 cache/TCM.
· Benchmarking: Build the "speedometer" for the chip. Prove your kernel is faster than the baseline.

What We Will Teach You
· Our proprietary RVM (Matrix) and RVV (Vector) intrinsic APIs.
· How to use our cycle-accurate profilers and hardware counters.
· The specific memory hierarchy constraints of our AI SoC.

Must-Have Qualifications
· Strong C/C++ skills, particularly in math- and logic-heavy code.
· Understanding of Computer Architecture basics: Registers, Cache Hierarchy (L1/L2), SIMD (Single Instruction Multiple Data).
· Comfortable reading and writing technical documentation (e.g., Instruction Set Architecture specs).
· Availability: minimum 3 months, at least 4 days per week.

Nice-to-Have
· Experience with CUDA, OpenMP, or AVX/Neon intrinsics.
· Coursework in Linear Algebra or Numerical Methods.

Espressif Systems (Shanghai) Co., Ltd. (乐鑫信息科技（上海）股份有限公司)

Company size: 500-999 employees