• AI Kernel Performance Intern 领跑者计划
  • [Part-time]
  • 10.0-12.0K/month
  • Bachelor's degree or above
  • Hiring: no headcount limit
Any major
Source: boss直聘
This position is no longer listed.

Job Details

领跑者计划 (roughly, the "Front-Runner Program") is an internship program with a path to a full-time offer, created by Espressif (乐鑫) for students in China and abroad graduating in 2027. The work location is Shanghai, China.

The Opportunity

You will be the reason our chip is fast. You will write the hand-tuned kernels that power Large Language Models (LLMs) on our custom RISC-V hardware. You will work directly with hardware architects to exploit our proprietary Matrix (RVM) and Vector (RVV) extensions, squeezing every last FLOP out of the silicon.

Key Responsibilities

· Kernel Implementation: Write kernels for GEMM and common epilogues (bias/activation/quant); implement Softmax/RMSNorm; evolve toward attention kernels as the project matures.
· Micro-Optimization: Analyze assembly output. Did the compiler unroll the loop? Did we stall on a memory load? You fix it.
· Tiling & Layout: Calculate the optimal way to chop a large tensor into "tiles" that fit in our L1 cache/TCM.
· Benchmarking: Build the "speedometer" for the chip. Prove your kernel is faster than the baseline.

What We Will Teach You

· Our proprietary RVM (Matrix) and RVV (Vector) intrinsic APIs.
· How to use our cycle-accurate profilers and hardware counters.
· The specific memory hierarchy constraints of our AI SoC.

Must-Have Qualifications

· Strong C/C++ skills, specifically with a math/logic focus.
· Understanding of Computer Architecture basics: Registers, Cache Hierarchy (L1/L2), SIMD (Single Instruction Multiple Data).
· Comfortable reading/writing technical documentation (Instruction Set Architecture specs).
· Availability: minimum 3 months, at least 4 days per week.

Nice-to-Have

· Experience with CUDA, OpenMP, or AVX/Neon intrinsics.
· Coursework in Linear Algebra or Numerical Methods.

乐鑫信息科技(上海)股份有限公司

  • Company size: 500-999 employees
