June Newsletter

Hi everyone-

It’s a bank holiday weekend – again – so that means it’s June and hopefully some warmer weather as May has definitely not delivered on that front … perhaps a few curated data science reading materials might prove useful for sunshine in the garden?

Following is the June edition of our Royal Statistical Society Data Science Section newsletter. Hopefully some interesting topics and titbits to feed your data science curiosity … We are continuing with our move of Covid Corner to the end to change the focus a little.

As always- any and all feedback most welcome! If you like these, do please send on to your friends- we are looking to build a strong community of data science practitioners. And if you are not signed up to receive these automatically you can do so here.

Industrial Strength Data Science June 2021 Newsletter

RSS Data Science Section

Committee Activities

We are all conscious that times are incredibly hard for many people and are keen to help however we can- if there is anything we can do to help those who have been laid-off (networking and introductions help, advice on development etc.) don’t hesitate to drop us a line.

We are now ‘two for two’ on our ‘Fireside chat’ series! Following on from our fantastic discussion with Andrew Ng, Giles Pavey hosted an engaging and enlightening conversation with Anthony Goldbloom on May 20th. Anthony is founder and CEO of Kaggle (now a Google company), the world’s largest data science and machine learning community. There was a great deal of insight into the evolution of data science over the 10 years Kaggle has been running as well as lots of audience questions. We will distill the session down and publish a summary shortly.

We will soon be releasing a survey to our readers and members focused on the UK Government’s proposed AI Strategy. We are passionate about making sure the government focuses on the right things in this area, and feel like true Data Science and AI practitioners need to feed into this process. So when you see the survey, do please take the time to fill it out if you can!

The full programme for this year’s RSS Conference, which takes place in Manchester from 6-9 September, has been confirmed. The programme includes keynote talks from the likes of Hadley Wickham, Bin Yu and Tom Chivers. Registration is open with early-bird discounts available until Friday 4 June.
In addition, the RSS now has a new accreditation – Data Analyst.

Data Analyst is a registered form of professional membership status that provides formal recognition of a member’s statistical training and work-based experience at entry level

Martin Goodson, our chair, continues to run the excellent London Machine Learning meetup and is very active with virtual events. The last event was on 24th May where Christian Szegedy, machine learning and AI researcher at Google Research, gave a talk titled ‘The Inverse Mindset of Machine Learning’. Videos are posted on the meetup YouTube channel – and future events will be posted here.

This Month in Data Science

Lots of exciting data science going on, as always!

Ethics and more ethics…
Bias, ethics and diversity continue to be hot topics in data science…

The real danger wasn’t “Deep Fakes.” The real danger is cheap fakes, fakes that can be produced quickly, easily, in bulk, and at virtually no cost
  • Regulators are rightly becoming increasingly active in an attempt to combat these issues. This HBR article helps map out what organisations need to know to be prepared.
  • We all know how complex ML models are becoming and the scale at which some of them now operate, and so we have to be open to the fact that mistakes will happen. The critical question becomes: what do you do about it when the issue surfaces? Twitter has taken a positive and transparent approach to dealing with some of their previous bias-related issues in automated cropping, releasing a detailed and technical analysis of why it was happening and the steps they are taking to remove the bias:
We want to thank you for sharing your open feedback and criticism of this algorithm with us. As we discussed in our recent blog post about our Responsible ML initiatives, Twitter is committed to providing more transparency around the ways we’re investigating and investing in understanding the potential harms that result from the use of algorithmic decision systems like ML.
  • Really interesting discussion on Kara Swisher’s Sway podcast with Daniel Kahneman (renowned behavioural economist – “Thinking, Fast and Slow”) delving into why we require much higher accuracy from computers and technology than from humans before we are willing to trust them.
  • And in a similar vein, this is thought-provoking: does more data necessarily mean better decision making?
  • Less specifically focused on bias and ethics, but really interesting commentary from Benedict Evans on Amazon and how much it really knows about what it sells, touching on how much of a responsibility a platform has for moderation of its own recommendation content.
Of Amazon’s top 50 best-sellers in “Children's Vaccination & Immunisation”, close to 20 are by anti-vaccine polemicists, and 5 are novels about fictional pandemics

Developments in Data Science…
As always, lots of new developments…

Real world applications of Data Science
Lots of practical examples making a difference in the real world this month!

How does that work?
A new section on understanding different approaches and techniques

Getting it live
How to drive ML into production

"For me, teaching this course was an unusual experience. MLOps standards and tools are still evolving, so it was exciting to survey the field and try to convey to you the cutting edge. I hope you will find it equally exciting to learn about this frontier of ML development, and that the skills you gain from this will help you build and deploy valuable ML systems." Andrew Ng

The Art of Visualisation
Making data science look right…

Practical Projects and Learning Opportunities
As always here are a few potential practical projects to keep you busy:

Covid Corner

Again, more positive progress in the UK on the Covid front with over 40m people now having received their first vaccine dose and over 25m fully vaccinated. However, the new variant originating in India is cause for concern.

Experts gave a median estimate of 30,000 Covid deaths by the end of the year, whereas the non-experts said 20,000. The truth was around 75,000.

Updates from Members and Contributors

  • Harald Carlens has put together a very useful comparison of cloud GPU services and pricing – definitely check it out if you are using deep learning in the cloud.
  • Lucie Burgess would like to announce an interesting set of discussions around the provenance and legality of automated decisions taking place on June 15th and June 22nd. Helix Data Innovation are running the sessions on behalf of the PLEAD project (King’s College London, University of Southampton, with partners Experian, Roke and Southampton Connect) – sign up here for what should be a good discussion on a very relevant topic.
  • Kevin O’Brien highlights the upcoming useR! 2021 conference on 5-9 July – a must-see for those R users out there.

Again, hope you found this useful. Please do send on to your friends- we are looking to build a strong community of data science practitioners- and sign up for future updates here.

– Piers

The views expressed are our own and do not necessarily represent those of the RSS
