-
Converting strings to numbers in ClickHouse
How to convert strings to integers and floats in ClickHouse, and how to control the handling of invalid values during conversion.
Published this week in #data about #clickhouse -
How to manage ingestion errors in ClickHouse
Managing errors when ingesting data into ClickHouse, including text data sources like CSV and TSV.
Published this week in #data about #clickhouse -
How to merge large tables in ClickHouse using join
How to merge multiple large tables into a single table based on a given column. A solution to the MEMORY_LIMIT_EXCEEDED problem when joining large tables.
Published a week ago in #data about #clickhouse -
How to use Regex to feed text data to ClickHouse
The Regexp input format can help load unformatted or broken text data into ClickHouse. Includes a practical example.
Published this month in #data about #clickhouse -
Formatting unstructured data using OpenAI API and Python
How to use OpenAI to convert unstructured text data into a structured format such as CSV, and how to set additional requirements for formatting specific values in the resulting CSV.
Published this month in #machinelearning about #python and #openai -
Quick start OpenAI API example using Python
How to start using OpenAI API with Python. A simple example of a Python script that generates data based on the OpenAI language model.
Published this month in #machinelearning about #python and #openai -
Using Sphinx to add full-text search to ClickHouse
How to configure Sphinx to index text data from ClickHouse. What IDs to use for ClickHouse documents with Sphinx. How to build an index and resolve matched documents in ClickHouse.
Published a month ago in #data about #clickhouse and #sphinx -
How to use multiple disks in ClickHouse
How to configure multiple disks as storage in ClickHouse, and how to use different disks for different tables.
Published a month ago in #data about #clickhouse -
What is a function derivative and how to optimize functions
The article explains what the derivative of a function is at a very basic level. Starting from the concept of a function, we move on to how functions change and finally look at a Python example of optimizing a function based on its derivative.
Published a month ago in #machinelearning about #math, #derivative and #python -
Matrices and vectors math for AI with Python examples
The article provides an introduction to vectors and matrices, two fundamental concepts in linear algebra that are widely used in artificial intelligence. It explains what vectors and matrices are and how they are defined in math, and covers basic operations with them in Python, including adding, multiplying, and transposing matrices.
Published a month ago in #machinelearning about #math, #matrix and #vector -
Creating a bigram language model for text generation with Python
Understanding bigram language models, which are statistical models that predict the likelihood of a word given its preceding word. Includes an example of a simple bigram language model in Python.
Published a month ago in #machinelearning about #nlp, #language-models and #python -
What is a language model and how it works
The basics of language models, which are algorithms that enable computers to analyze and understand human language. The article explains how language models work and how they are trained, using a simple example of a program that can understand and respond to simple questions.
Published a month ago in #machinelearning about #nlp and #language-models -
What is Machine Learning and how it works
Machine Learning basics, the math behind machine learning, predictions, prediction errors, training dataset, validation dataset.
Published a month ago in #machinelearning -
Using csvkit to format, clean, and fix CSV files
Formatting CSV, TSV, and other files, converting CSV delimiters, converting CSV quoting symbols, fixing invalid CSV files, working with compressed CSV files.
Published a month ago in #programming about #python and #csv -
Reading CSV, TSV, and invalid CSV files with Golang
Reading CSV with Golang line by line or entirely, reading CSV with custom delimiters (including TSV) and escaping rules, and reading broken CSV files.
Published a month ago in #programming about #golang and #csv -
Welcome to DataChild - a place to learn data programming and ML
A welcome post about the idea behind this place, its basic approach, target audience, and goals.
Published a month ago in #data