April Newsletter

Hi everyone-

Another month flies by… still cold, but I’ve definitely seen the sun once or twice… I hope the on-again off-again dreams of a proper summer holiday aren’t proving too painful … perhaps a few curated data science reading materials might ease the burden over the Easter weekend?

Following is the April edition of our Royal Statistical Society Data Science Section newsletter. Hopefully some interesting topics and titbits to feed your data science curiosity …

As always- any and all feedback most welcome! If you like these, do please send on to your friends- we are looking to build a strong community of data science practitioners.

Industrial Strength Data Science April 2021 Newsletter

RSS Data Science Section

Covid Corner

It definitely feels like progress, at least in the UK, on the Covid front, with over 30m people now having received their first vaccine dose. Supply issues notwithstanding, it is clear that the vaccine roll-out is progressing very well.

  • It is now over a year since the UK first went into lockdown to attempt to restrict the spread of the virus. It’s interesting to reflect on how much data and statistics have become part of general public discussion: we still have daily updates of a number of different metrics on the news and published in papers. ‘More or Less’ has a nice summary of the UK’s efforts to collate and disseminate the figures and how the centralised healthcare setup contrasts favourably with the US, which required volunteers to generate national figures in the Covid Tracking Project.
  • Despite (or perhaps because of) the proliferation of data, the statistics have been made to argue many sides of the same case as highlighted in this research from MIT, stressing the importance of good visualisations.
  • It has been quite a month for Astra Zeneca …
 “Overall it’s a win for the world”

Committee Activities

We are all conscious that times are incredibly hard for many people and are keen to help however we can- if there is anything we can do to help those who have been laid off (help with networking and introductions, advice on development, etc.) don’t hesitate to drop us a line.

Our first ‘Ethics Happy Hour’ on March 17th was very well received – see the write-up here. The video recording will shortly be posted on YouTube and we will publish links to it when it is available. Please let us know if you have any comments or would like to suggest topics for future events via email to dss.ethics@gmail.com.

Fresh on the heels of our incredibly successful event with Andrew Ng, we are excited to announce the next instalment in the series. The RSS Data Science Section invites you to a fireside conversation with Anthony Goldbloom – founder and CEO of Kaggle (now a Google company), the world’s largest data science and machine learning community with over 6 million members. Forbes has twice named Anthony one of the 30 under 30 in technology, the MIT Technology Review has named him as one of the 35 Innovators Under 35 and the University of Melbourne has given Anthony an Alumni of Distinction Award. Hear Anthony share his thoughts and experiences from the past 10 years at the forefront of competitive Machine Learning. Watch this space for more details!

Martin Goodson, our chair, continues to run the excellent London Machine Learning meetup and remains very active with virtual events. The next event is on 7th April, when Mike Lewis, research scientist at Facebook AI Research in Seattle, will give a talk titled ‘Beyond BERT: Representation Learning for Natural Language at Scale’. Videos are posted on the meetup YouTube channel – and future events will be posted here.

Elsewhere in Data Science

Lots of non-Covid data science going on, as always!

Ethics and more ethics…
Bias, ethics and diversity continue to be hot topics in data science…

"I will have a lot more to say about this later. But announcing a new org by a Black woman as if we’re all interchangeable while harassing, terrorizing and gaslighting my team and doing absolutely ZERO to acknowledge & redress the harm that’s been done is beyond gaslighting."
"Everything the company does and chooses not to do flows from a single motivation: Zuckerberg’s relentless desire for growth."

Developments in Data Science…
As always, lots of new developments…

The Practical side … getting stuff to work in production

"When a system isn’t performing well, many teams instinctually try to improve the Code. But for many practical applications, it’s more effective instead to focus on improving the Data."
"It’s a common joke that 80 percent of machine learning is actually data cleaning, as though that were a lesser task. My view is that if 80 percent of our work is data preparation, then ensuring data quality is the important work of a machine learning team."

How does that work?
A new section on understanding different approaches and techniques

Thinking about intelligence and bigger picture stuff
Stepping back from the code for a bit…

  • Thought-provoking article proposing that “Computers will never write good novels” – definitely worth thinking through how much of this you agree with
"The best that computers can do is spit out word soups. They leave our neurons unmoved."
"Employees are far happier when they are led by people with deep expertise in the core activity of the business."

Practical Projects and Learning Opportunities
As always here are a few potential practical projects to while away the socially distanced hours:

Updates from Members and Contributors

  • Marco Gorelli is running an excellent workshop on 10th April about contributing to Pandas. The workshop is being run in collaboration with PyLadies and is specifically targeting people from underrepresented genders in tech. Sign up for the morning session or the afternoon session.
  • Emre Kasim is running the brilliant Algo Conference, which this year is taking place online on April 29th with a number of very relevant streams, including ‘Foundational AI’, ‘AI and Innovation’ and ‘Implications of AI and other Disruptive Technologies’- well worth signing up for here.
  • Alex Spanos highlights the upcoming Data Science Festival which in April is focused on Fintech- check out his talk on Data Science/Machine Learning and Open Banking APIs on April 15th.
  • Vijay Kumar Mishra, Research Scientist at the Public Health Foundation of India, is running a 5-day online international workshop on ‘Designing and Conducting Clinical Trials’ from the 3rd to the 7th of May. The workshop will be jointly conducted by the Public Health Foundation of India, Sitaram Bhartia Institute of Science and Research, Paropakar Maternity and Women Hospital and University College London, and is aimed at providing a theoretical understanding of designing and conducting clinical trials. Contact Vijay (vijay.mishra@phfi.org) for more details.
  • Harin Sellahewa draws our attention to the 35 of 70 master’s students entering their final assessment for the University of Buckingham MSc in Applied Data Science- best of luck to everyone!

Again, hope you found this useful. Please do send on to your friends- we are looking to build a strong community of data science practitioners- and sign up for future updates here:

– Piers

The views expressed are our own and do not necessarily represent those of the RSS
