full-time

LLM Application Engineer (Full-Stack)

Company
MERCURIA ENERGY TRADING SA
Location
Geneva
Publication date
18.06.2025
Reference
4876514

Description

Established in 2004, Mercuria is one of the leading integrated energy and commodity trading companies in the world. We bring energy markets together to support the needs of today by trading, structuring finance, and investing in strategic assets, while generating more than $110 billion in turnover.
Our operations span over 50 countries on 5 continents, including all the major energy hubs. We trade physical oil, energy products, environmental products and other commodities from Geneva, London, Singapore, Shanghai, Beijing, Dubai, Houston, Calgary and Greenwich (CT).
We are committed to advancing the transition to a more sustainable, affordable and reliable energy system for tomorrow. Over 50% of our assets are in low-carbon and energy transition sectors, providing a strong platform to trade these new markets and support decarbonization.
In 2023, we established Silvania, a $500 million fund investing in the restoration and protection of nature and biodiversity globally, in support of the Paris Agreement goals and the UN 30x30 biodiversity initiative. This new asset class supports nature protection and provides sustainable financial returns.

The role
We’re seeking an experienced LLM Application Engineer (Full-Stack) to design, develop, and deploy robust AI-powered chatbot applications tailored to the fast-paced needs of trading, middle office, and back office operations. This role demands full ownership of the application stack, from chat interfaces to secure APIs and legacy integration, working alongside data scientists and infrastructure engineers to deliver real-time, secure, and scalable AI services.
You will turn large language models into a secure, low-latency chatbot service that is production-ready for financial environments. You’ll architect and implement everything from UI and entitlements to backend APIs and observability dashboards, ensuring seamless integration into the firm's broader ecosystem.

Key Responsibilities
Chat UI Development: Build and maintain a responsive React/TypeScript-based interface supporting desktop and mobile, OAuth/OIDC SSO, chat and semantic search patterns.
API & Model Serving: Serve LLMs via FastAPI-based REST/gRPC endpoints; deploy models using Triton, Ray Serve, or SageMaker with GPU-aware autoscaling.
Security & Access Control: Enforce row-level entitlements using OAuth claims, implement prompt validation, rate limits, and comprehensive audit logging.
Legacy System Integration: Develop low-latency interfaces to interoperate with existing systems in Java/.NET, as well as FIX, Kafka, and message queues.
Monitoring & Cost Efficiency: Set up dashboards for latency, model accuracy, and Bedrock usage (Prometheus/Grafana); manage GitLab CI/CD pipelines for safe, blue-green deployments.
Infrastructure Automation: Contribute Terraform modules to provision and manage EKS clusters, API Gateways, Transit Gateways, and Lambda functions in coordination with platform engineers.

Technical Expertise
Languages & Frameworks: Python 3.x, FastAPI, LangChain/Haystack, React, TypeScript
Model Ops: Hugging Face, Bedrock SDK, Triton Inference Server, Ray Serve, AWS SageMaker
Authentication: OAuth2, OIDC, SSO
Data Stores: SQL, pgvector, Pinecone, OpenSearch
Infrastructure: Docker, Kubernetes/EKS, Terraform, GitLab CI
Observability: Prometheus, Grafana
Protocols: REST, gRPC, WebSockets, Server-Sent Events (SSE)
Fluent English

Non-technical Skills
4+ years of experience developing cloud-native, user-facing applications, ideally within trading or financial services.
Proven success in deploying LLMs or deep learning models in production environments.
Strong communication and demoing skills with the ability to present to both technical and executive stakeholders.
Ability to thrive in a fast-moving team, combining rapid iteration with enterprise-grade reliability and security.
