You can unsubscribe (please do unsubscribe if you no longer need this!) if my thoughts and jobs aren't interesting. You can also update your profile to change your email address. Subscribe here - maybe a friend needs this link?
James Powell of NumFOCUS is leading a fundraiser this weekend - pay what you want to watch him live-code for 4 hours. He’s got a set of famous names joining for the live event - you’re likely to learn a ton!
Core Pandas developer Uwe Korn has done some fabulous work with the Fletcher library to build out a new Pandas string library. He’s taken an old Kaggle example I’d worked on with Numba to accelerate string comparison operations and combined it with Dask - this is well worth a read.
Early next year I’m running my next highly rated Software Engineering for Data Scientists class. This is for you if you need a stronger and more standardised development process, want to develop faster using tests, and want to refactor to make your code reusable.
The course runs on 3 UK mornings using Zoom and Slack in late February. There’s a limited set of Early Bird tickets and if you’re fast you can use this Christmas discount code for 10% off (XMAS2020_10PERCENT). Mail me back if you’ve got questions but don’t dally if you want to attend.
My old colleague Paul Ross got in touch about his new memory tracing debugger tool. If you need to hunt down memory leaks, you may well want to check out his tool. He says: “As data sets become larger and larger, the memory demands for processing increase enormously. PyMemTrace provides a collection of tools that help you understand where and by how much memory is being used during processing. The tools vary in granularity and invasiveness, from just the overall total memory usage down to line by line reporting of every malloc() and free(). These tools can highlight where excessive memory usage is slowing down data processing.”
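To illustrate the general idea of tracking where memory is allocated during processing - note this is a minimal sketch using the standard library's tracemalloc, not PyMemTrace's own API - you can snapshot allocations and list the heaviest source lines:

```python
# Minimal sketch of line-level memory tracing using the stdlib's
# tracemalloc (illustrative only - PyMemTrace has its own tools).
import tracemalloc

def build_big_list(n):
    # A deliberately memory-hungry step we'd like to locate.
    return [str(i) * 10 for i in range(n)]

tracemalloc.start()
data = build_big_list(100_000)
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

# Report the three source lines responsible for the most memory.
for stat in snapshot.statistics("lineno")[:3]:
    print(stat)
```

Tools like PyMemTrace go further by varying the granularity, down to every malloc() and free(), but the snapshot-and-rank workflow above is often enough to find the hot spot.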
My High Performance Python 2nd ed is on GoodReads where we'd dearly love a review please :-) Do let me know if you write a review and I can add it to my blog post.
I work with data science teams to quickly deliver strategic plans and I coach teams and executives in the execution of these plans - covering process, technique, communication, prioritisation, tooling and lots more. Get in contact if you think this might help your team. See the Recommendations on my LinkedIn profile for kind words from clients I've helped.
The roles are listed below. Contact details are listed against each job.
If you've got a job to share (this list has over 1,000 data scientists and data engineers) then get in contact. If you're an active data scientist who helps build our PyDataLondon community then you (not a colleague - just you!) get your first post for free and I take the time-cost as a contribution towards building our community. If you're from outside the community or a recruiter then commercial rates apply. In either case I vet the advert to make sure it is suitable for this list.
Do you need training? List-owner-Ian provides Python, data science and high performance training using both general Python tools and the Anaconda environment. You can reply to this email to contact Ian directly or join the training announcement list.
If you're interested in these sorts of roles then you might also want to come along to our PyDataLondon meetup each month - community members shout out their jobs at the end of each night, and we have over 10,000 members.
Cheers, Ian Ozsvald
We are looking for a Senior Data Engineer to join our Ally product team. Ally is an industry-leading suite of software products that enables organisations with online communities to safeguard their users against negative behaviour, such as grooming, racism, homophobia and bullying. Ally utilises sophisticated AI technologies such as language and behavioural analysis to automate the interception of abuse in near real-time, around the clock. Our clients include some of the games industry's biggest names and, across all of our clients, we process billions of messages a month. As Ally grows, we aim to expand our market reach beyond the games industry. The successful candidate will be working across the stack, improving existing functionality, and developing new areas of the application as the product grows and adapts to client requirements. They will be responsible for optimising Ally's data and data pipeline architecture as well as optimising how data flows through the system. The nature of Ally's implementation means we don't rely on IaaS proprietary services. Being a software engineer is as important as understanding data engineering technologies.
Location: London / Remote
Contact: firstname.lastname@example.org (please mention this list when you get in touch)
We are looking for a Data Scientist, ideally with a strong developer background, to join the Research Software Development Group at University College London. This is a permanent role with an excellent work-life balance and a special focus on health & sensitive data (experience in this domain is not required). We are looking for a candidate who loves learning and who wishes to use their skill set to make a difference in today's world. They will be working alongside world-renowned clinicians and researchers on a variety of projects, some involving one of the busiest intensive care units in London and others involving some of the most data-rich Covid studies currently in progress.
Rate: ~£45-52k depending on experience
Contact: email@example.com (please mention this list when you get in touch)
We are seeking a Data Engineer to join our growing team. The hire will be responsible for expanding and optimizing our data and data pipeline architecture, as well as optimizing data flow and collection for cross-functional teams. The Data Engineer will support our software developers, software architects, data analysts and data scientists on data initiatives and will ensure optimal data delivery architecture is consistent throughout ongoing projects. They must be self-directed and comfortable supporting the data needs of multiple teams, systems and products. The right candidate will be excited by the prospect of optimizing or even re-designing our company’s data architecture to support our next generation of products and data initiatives.
Location: United States (Remote)
Contact: firstname.lastname@example.org (please mention this list when you get in touch)
Side reading: link
EntityX runs a large-scale bespoke knowledge graph infrastructure for cookie-free contextual targeting and brand safety. We’re a small team looking for a freelance backend dev to work on a variety of sub-projects:
- Improve entity disambiguation graph algorithms
- Produce per-entity sentiment analysis
- Integration with 3rd party publisher and ad server APIs
Skills & experience:
- Python
- Graph experience (very desirable)
- AWS infrastructure
- Low-latency API development & deployment (gRPC & HTTP)
- PostgreSQL, Redis
Rate: Flexible part-time 3 month+ engagement; £400 day rate.
Location: UK based remote
Contact: email@example.com (please mention this list when you get in touch)