Ian Ozsvald's newsletter

Ian Ozsvald's data thoughts

Hello. You signed up to this list for Python-related Data Science thoughts and jobs (administered by Ian Ozsvald at Mor Consulting). You probably joined via Ian Ozsvald's blog.

You can UNSUBSCRIBE (please, do Unsubscribe if you don't need this any more!) if my thoughts and jobs aren't interesting. You can also update your profile to change your email address. Subscribe here - maybe a friend needs this link?

NumFOCUS fundraiser, faster Pandas strings, new memory debugger

James Powell of NumFOCUS is leading a fundraiser this weekend - pay what you want to watch him live code for 4 hours. He’s got a set of famous names joining for the live event - you’re likely to learn a ton!

Core Pandas developer Uwe Korn has done some fabulous work with the Fletcher library to build out a new Pandas string library. He’s used an old Kaggle example I’d worked on with Numba to accelerate string comparison operations along with Dask. This is well worth a read.

Early next year I’m running my next highly rated Software Engineering for Data Scientists class. This is for you if you need a stronger and standardised development process, want to develop faster using tests and want to refactor to make your code reusable.

The course runs on 3 UK mornings using Zoom and Slack in late February. There’s a limited set of Early Bird tickets and if you’re fast you can use this Christmas discount code for 10% off (XMAS2020_10PERCENT). Mail me back if you’ve got questions but don’t dally if you want to attend.

My old colleague Paul Ross got in touch about his new memory tracing debugger tool. If you need to hunt down memory leaks, maybe you’ll want to check his tool. He says: “As data sets become larger and larger the memory demands for processing increases enormously. PyMemTrace provides a collection of tools that help you understand where and by how much memory is being used during processing. The tools vary in granularity and invasiveness from just the overall total memory usage or down to line by line reporting of every malloc() and free(). These tools can highlight where excessive memory usage is slowing down data processing.”
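
To get a feel for this kind of memory tracing, here is a minimal sketch using the standard library's tracemalloc module. It is not PyMemTrace and does not use its API; the names build_rows and top_allocations are made up for illustration. It takes a snapshot before and after a workload and reports which source lines allocated the most memory.

```python
import tracemalloc


def build_rows(n):
    """Hypothetical workload: allocate a list of n short strings."""
    return [f"row-{i}" for i in range(n)]


def top_allocations(workload, *args, limit=3):
    """Run workload(*args) and report where new memory was allocated.

    Returns (result, peak_bytes, stats), where stats is a list of
    tracemalloc.StatisticDiff entries grouped by source line.
    """
    tracemalloc.start()
    try:
        before = tracemalloc.take_snapshot()
        result = workload(*args)
        after = tracemalloc.take_snapshot()
        # get_traced_memory() returns (current_bytes, peak_bytes)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    # Largest differences come first in the comparison result
    stats = after.compare_to(before, "lineno")[:limit]
    return result, peak, stats


if __name__ == "__main__":
    rows, peak, stats = top_allocations(build_rows, 100_000)
    print(f"peak traced memory: {peak / 1e6:.1f} MB")
    for stat in stats:
        print(stat)
```

tracemalloc only sees allocations made through Python's own allocator, so it sits at the coarser end of the granularity range described in the quote; line by line reporting of every malloc() and free() needs a lower-level tool.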

My High Performance Python 2nd ed is on GoodReads where we'd dearly love a review please :-) Do let me know if you write a review and I can add it to my blog post.

Data Science Strategy and Coaching for Teams

I work with data science teams to quickly deliver strategic plans and I coach teams and executives in the execution of these plans - covering process, technique, communication, prioritisation, tooling and lots more. Get in contact if you think this might help your team. See the Recommendations on my LinkedIn profile for kind words from clients I've helped.


The roles are listed below. Contact details are listed against each job.

If you've got a job to share (this list has over 1,000 data scientists and data engineers) then get in contact. If you're an active data scientist who helps build our PyDataLondon community then you (not a colleague - just you!) get your first post for free and I take the time-cost as a contribution towards building our community. If you're from outside the community or a recruiter then commercial rates apply. In either case I vet the advert to make sure it is suitable for this list.

Do you need training? List-owner-Ian provides Python, data science and high performance training using both general Python tools and the Anaconda environment. You can reply to this email to contact Ian directly or join the training announcement list.

If you're interested in these sorts of roles then you might also want to come along to our PyDataLondon meetup each month, as community members shout out their jobs at the end of each night. We have over 10,000 members.

Cheers, Ian Ozsvald

Senior Data Engineer at Spirit.AI, Permanent, London/Remote

We are looking for a Senior Data Engineer to join our Ally product team. Ally is an industry-leading suite of software products that enables organisations with online communities to safeguard their users against negative behaviour, such as grooming, racism, homophobia and bullying. Ally utilises sophisticated AI technologies such as language and behavioural analysis to automate the interception of abuse, all the time, in near real-time. Our clients include some of the games industry's biggest names and across all of our clients, we process billions of messages a month. As Ally grows, we aim to expand our market reach beyond the games industry. The successful candidate will be working across the stack, improving existing functionality, and developing new areas of the application as the product grows and adapts to client requirements. They will be responsible for optimising Ally's data and data pipeline architecture as well as optimising how data flows through the system. The nature of Ally's implementation means we don't rely on IaaS proprietary services. Being a software engineer is as important as understanding data engineering technologies.

Rate: competitive

Location: London / Remote

Contact: (please mention this list when you get in touch)

Side reading: link, link

Data Scientist - University College London

We are looking for a Data Scientist, ideally with a strong developer background, to join the Research Software Development Group at University College London. This is a permanent role with an excellent work-life balance and a special focus on health & sensitive data (experience in this domain is not required). We are looking for a candidate who loves learning and who wishes to use their skill set to make a difference in today's world. They will be working alongside world renowned clinicians and researchers, on a variety of projects, some involving one of the busiest intensive care units in London and others involving some of the most data-rich Covid studies currently in progress.

Rate: ~£45-52k depending on experience

Location: London/Remote

Contact: (please mention this list when you get in touch)

Side reading: link, link, link

Data Engineer at CRST International

We are seeking a Data Engineer to join our growing team. The hire will be responsible for expanding and optimizing our data and data pipeline architecture, as well as optimizing data flow and collection for cross functional teams. The Data Engineer will support our software developers, software architects, data analysts and data scientists on data initiatives and will ensure optimal data delivery architecture is consistent throughout ongoing projects. They must be self-directed and comfortable supporting the data needs of multiple teams, systems and products. The right candidate will be excited by the prospect of optimizing or even re-designing our company’s data architecture to support our next generation of products and data initiatives.


Location: United States (Remote)

Contact: (please mention this list when you get in touch)

Side reading: link

Knowledge Graph & API Engineer, Entity-X

EntityX runs a large-scale bespoke knowledge graph infrastructure for cookie-free contextual targeting and brand safety. We’re a small team looking for a freelance backend dev to work on a variety of sub-projects:

- Improve entity disambiguation graph algorithms
- Produce per-entity sentiment analysis
- Integration with 3rd party publisher and ad server APIs

Skills & experience:

- Python
- Graph experience v desirable
- AWS infrastructure
- Low-latency API development & deployment (GRPC & HTTP)
- PostgreSQL, Redis

Rate: Flexible part-time 3 month+ engagement; £400 day rate.

Location: UK based remote

Contact: (please mention this list when you get in touch)

If you've never tried a Python easter egg before - try typing "import this" at the Python prompt.

MorConsulting Ltd (UK) · 93 Merlin Grove · Beckenham · London, Bromley BR3 3HS · United Kingdom
