Data Engineer
🌟 Opportunity: Data Engineer
Our client is an early-stage AI startup developing cutting-edge technology to bring responsible AI into the legal sector. With a strong focus on safety, transparency, and verifiability, they're building next-generation retrieval systems to support legal professionals in their daily workflows.
As a Data Engineer, you'll architect and own the data infrastructure that powers the client's legal research platform. You'll build systems that lawyers rely on daily for critical research, ensuring data is always current and accurate.
🚀 Responsibilities:
- Build custom API integrations with law firm document management systems (iManage, SharePoint, NetDocuments), mirroring complex permission structures and syncing in real time with change detection
- Integrate new legal data sources, collaborating with legal experts to understand requirements and build custom scrapers and pipelines
- Design and implement monitoring systems to track data freshness, quality, and pipeline health across both public legal sources and private client document repositories
- Scale and manage the ingestion pipeline as new data sources come online, ingestion volume grows, and more users work concurrently
👤 Profile sought:
You have experience building production pipelines and working with complex data sources, while maintaining high standards for quality and reliability.
You understand that legal professionals depend on accurate, timely data for critical decisions, and you take that responsibility seriously. At the same time, you know that building at a startup means shipping fast, making pragmatic tradeoffs, and sometimes choosing the solution that works today over the perfect architecture that might work tomorrow.
Experience:
- Built and scaled production data pipelines from scratch, handling everything from scraping and ETL to storage and monitoring
- Strong Python backend development experience with a focus on writing clean and maintainable code
- Proven track record deploying and managing services on cloud platforms (Azure, AWS) in production environments
- Hands-on experience with orchestration tools (Dagster, Airflow, Modal, or Prefect)
- Production experience with database technologies (PostgreSQL, Vespa, MongoDB, or equivalent)
- Solid experience with infrastructure technologies (Kubernetes, Docker, or equivalent)
Nice to have:
- Experience integrating enterprise document management systems (iManage, SharePoint, etc.) with complex permission handling into data pipelines
- Knowledge of building AI-powered data pipelines (embeddings, vector search, LLM preprocessing)
- Background working in regulated industries where data accuracy and audit trails aren't optional
🌍 Benefits & Culture:
- Competitive salary, commensurate with experience
- Stock options
- 22 days of paid vacation + your birthday off
- Other perks such as a flexible working location and schedule, health insurance, and an equipment budget
- Opportunity to shape and own the entire data architecture at a fast-moving early-stage startup
💼 Department: Tech
📍 Location: Portugal; Lisbon is ideal for occasional meetings at the office
📆 Starting date: ASAP
- Locations: Lisboa
- Remote status: Fully remote
