Circular Statistics in Python: An Intuitive Intro

In this contributed article, Amit Babayoff, a data scientist at Deeyook, introduces circular statistics, covering some of its basic principles and tools and explaining why conventional linear methods don’t work well on circular data. She also explores how a simple filter for handling noise can be constructed from these basic tools.
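
As a rough illustration of why linear methods struggle with circular data (this sketch is not taken from the article, and the function name `circular_mean_deg` is a hypothetical helper), consider two compass headings of 350° and 10°. Their arithmetic mean points the opposite way, while averaging the corresponding unit vectors recovers a direction near north:

```python
import math

def circular_mean_deg(angles_deg):
    """Mean direction in degrees, found by averaging unit vectors.

    Hypothetical helper for illustration; assumes the inputs are
    angles in degrees and that they do not cancel out exactly.
    """
    sin_sum = sum(math.sin(math.radians(a)) for a in angles_deg)
    cos_sum = sum(math.cos(math.radians(a)) for a in angles_deg)
    # atan2 returns a value in (-180, 180] degrees; wrap it into [0, 360).
    return math.degrees(math.atan2(sin_sum, cos_sum)) % 360.0

angles = [350.0, 10.0]

# The arithmetic mean lands at 180 degrees, opposite to both headings.
linear_mean = sum(angles) / len(angles)

# The circular mean lands at (or within rounding error of) 0 degrees,
# which is the same direction as 360 degrees.
circ_mean = circular_mean_deg(angles)

print(linear_mean, circ_mean)
```

Because of floating-point rounding, the circular result may print as a value just above 0 or just below 360; both describe the same direction.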

Video Highlights: BigQuery + Notebooks: Building an Analytics Pipeline on Kaggle

Your architecture choices impact how efficiently you’re able to use your data. In this “Snapshots” video produced by Kaggle, Data Scientist Wendy Kan demonstrates how she incorporates BigQuery and Kaggle Notebooks into her workflow. Watch her create an interactive network analysis graph that explores the most commonly installed Python packages!

Altair and Cylc Take Weather Prediction by Storm

In this Sponsored Post, our friends over at Altair explain that to keep the world’s weather sites running smoothly, Altair and the Cylc open source community have packaged an industry-leading workload manager, Altair PBS Professional™, together with the Cylc workflow engine plus other helpful plug-ins to create the Altair Weather Solution.

Video: Profiling Python Workloads with Intel VTune Amplifier

Paulius Velesko from Intel gave this talk at the ALCF Many-Core Developer Sessions. “This talk covers efficient profiling techniques that can help to dramatically improve the performance of code by identifying CPU and memory bottlenecks. We will demonstrate how to profile a Python application using Intel VTune Amplifier, a full-featured profiling tool.”

The Impact of Python: How It Could Rule the AI World?

In this contributed article, writer, AI researcher, and business strategist Michael Lyman discusses the growing use of the Python language and how it is playing a significant role in the rise of AI and deep learning. Python’s power and ease of use have catapulted it to become one of the core languages for providing machine learning solutions.

CUDA-Python and RAPIDS for blazing fast scientific computing

Abe Stern from NVIDIA gave this talk at the ECSS Symposium. “We will introduce Numba and RAPIDS for GPU programming in Python. Numba allows us to write just-in-time compiled CUDA code in Python, giving us easy access to the power of GPUs from a powerful high-level language. RAPIDS is a suite of tools with a Python interface for machine learning and dataframe operations. Together, Numba and RAPIDS represent a potent set of tools for rapid prototyping, development, and analysis for scientific computing. We will cover the basics of each library and go over simple examples to get users started.”

Joe Landman on How the Cloud is Changing HPC

In this special guest feature, Joe Landman from Scalability.org writes that the move to cloud-based HPC is having some unexpected effects on the industry. “When you purchase a cloud HPC product, you can achieve productivity in time scales measurable in hours to days, where previously weeks to months was common. It cannot be overstated how important this is.”

Podcast: When a Different OS Gets Different Results

In this podcast, the Radio Free HPC team looks at problems in the scientific software world. “There’s a bug in Python scripts that caused different results in identical routines run on different operating systems. As the guys discuss, it’s not a Python thing but a problem with the order in which files got read according to the operating system’s protocols. This impacts the sort order and thus the end results. The gang speculates on other causes of these types of problems and the fixes that should be employed.”
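
The podcast blurb does not say which calls were involved, so the following is only a minimal sketch of the general pattern, with a hypothetical helper name. Python's `os.listdir` returns entries in arbitrary order, which can differ between operating systems and file systems; sorting the names explicitly makes the processing order reproducible:

```python
import os
import tempfile

def list_files_deterministic(directory):
    """Return directory entries in a fixed, sorted order.

    Hypothetical helper for illustration: os.listdir() makes no
    ordering guarantee, so we sort before using the result.
    """
    return sorted(os.listdir(directory))

# Build a throwaway directory so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["b.txt", "a.txt", "c.txt"]:
        with open(os.path.join(tmp, name), "w") as handle:
            handle.write(name)

    names = list_files_deterministic(tmp)

print(names)
```

The same idea applies to `glob.glob` and `pathlib.Path.iterdir`, whose results are also not guaranteed to be sorted.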

Parallel Computing in Python: Current State and Recent Advances

Pierre Glaser from INRIA gave this talk at EuroPython 2019. “Modern hardware is multi-core. It is crucial for Python to provide high-performance parallelism. This talk will expose to both data-scientists and library developers the current state of affairs and the recent advances for parallel computing with Python. The goal is to help practitioners and developers to make better decisions on this matter.”

Interview: Terry Deem and David Liu at Intel

I recently caught up with Terry Deem, Product Marketing Manager for Data Science, Machine Learning and Intel® Distribution for Python, and David Liu, Software Technical Consultant Engineer for the Intel® Distribution for Python, both from Intel, to discuss the Intel® Distribution for Python (IDP): targeted classes of developers, use with commonly used Python packages for data science, benchmark comparisons, the solution’s use in scientific computing, and a look to the future with respect to IDP.