Constantin is an experienced Data Engineer with a proven track record across multiple industries. He is certified as a GCP Professional Data Engineer and an AWS Certified Data Analytics Specialist. Skilled in Python, C#, Scala, BigQuery, Spark, and the Microsoft BI stack, he excels in PowerBI, Tableau, and DataStudio, and is proficient in SQL Server, MySQL, MongoDB, and DevOps tools such as Airflow and Docker. He is fluent in English, French, Romanian, and Russian, and conversational in German.
System variables in BigQuery are special variables available in multi-statement queries that let users read, and in some cases write, query metadata during execution. Examples include @@query_label for query labels, @@time_zone for the default time zone, and @@script.job_id for the job ID of the running script. They are set and read much like user-defined variables but are built in and carry the @@ prefix; others, such as @@current_job_id and @@dataset_id, expose further details about the query context.
BigQuery's job history feature saves every query and gives access to both personal and project history. It lets users retrieve complex queries they never saved, view the results of long-running queries without re-running them, and check job details such as duration, resources used, and timings. Users can also see all jobs executed in the project, including those submitted by external clients, and search and filter queries.
A SQL anti-join keeps the rows of one table that have no match in another. A LEFT ANTI JOIN is written as LEFT JOIN plus WHERE [key in right table] IS NULL, and a RIGHT ANTI JOIN as RIGHT JOIN plus WHERE [key in left table] IS NULL. Unlike EXCEPT, which compares entire rows and so treats a difference in any column as a mismatch, an anti-join compares only the join keys. Example: a LEFT ANTI JOIN finds products that have a details row but no pricing row.
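The anti-join pattern described above can be tried locally. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for BigQuery; the table names product_details and product_prices and their rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for BigQuery (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product_details (product_id INTEGER, name TEXT);
    CREATE TABLE product_prices  (product_id INTEGER, price REAL);
    INSERT INTO product_details VALUES (1, 'olives'), (2, 'bread'), (3, 'cheese');
    INSERT INTO product_prices  VALUES (1, 2.50), (3, 4.00);
""")

# LEFT ANTI JOIN: LEFT JOIN, then keep only rows with no match on the right.
left_anti_join = """
    SELECT d.product_id, d.name
    FROM product_details AS d
    LEFT JOIN product_prices AS p ON p.product_id = d.product_id
    WHERE p.product_id IS NULL
    ORDER BY d.product_id
"""
unpriced = conn.execute(left_anti_join).fetchall()
print(unpriced)  # [(2, 'bread')]
```

Only the join key decides the outcome here: 'bread' is returned because no pricing row shares its product_id, regardless of what the other columns contain.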
NULL in SQL represents the absence of a value and is distinct from an empty string or zero. Key points: NULL cannot be matched through an IN list; comparisons such as NULL = NULL evaluate to NULL rather than TRUE, so they never match; COUNT(column) excludes NULLs while COUNT(*) includes them; other aggregate functions ignore NULLs; ORDER BY places NULLs first by default when sorting in ascending order in BigQuery; and equality-based join conditions do not match NULLs. NULLs can be handled with IS NULL, COALESCE, IFNULL/ISNULL, and NULLIF.
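Several of these NULL rules can be checked directly. The sketch below uses Python's built-in sqlite3 module as a stand-in for BigQuery (an assumption; the rules shown behave the same way in both), with a hypothetical one-column table t. SQL NULL appears as None in Python.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for BigQuery (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (v INTEGER);
    INSERT INTO t VALUES (1), (2), (NULL);
""")

def scalar(sql):
    """Run a query and return the first column of its first row."""
    return conn.execute(sql).fetchone()[0]

null_equals_null = scalar("SELECT NULL = NULL")        # None: the result is NULL, not TRUE
null_is_null = scalar("SELECT NULL IS NULL")           # 1: IS NULL is the correct test
in_list = conn.execute("SELECT v FROM t WHERE v IN (1, NULL)").fetchall()  # [(1,)]
count_star = scalar("SELECT COUNT(*) FROM t")          # 3: counts every row
count_col = scalar("SELECT COUNT(v) FROM t")           # 2: skips the NULL
average = scalar("SELECT AVG(v) FROM t")               # 1.5: NULL is ignored, not treated as 0
self_join = scalar("SELECT COUNT(*) FROM t AS a JOIN t AS b ON a.v = b.v")  # 2: NULL never joins
coalesced = scalar("SELECT COALESCE(NULL, 0)")         # 0: first non-NULL argument
nullified = scalar("SELECT NULLIF(5, 5)")              # None: equal arguments yield NULL
```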
FLOAT (FLOAT64) is a double-precision floating-point number that can express a very wide range of values, uses 8 logical bytes, and offers faster calculations, but it can introduce rounding errors. It suits scientific calculations and queries that tolerate small differences. NUMERIC is a fixed-point decimal type with up to 38 digits of precision, 9 of them after the decimal point, and uses 16 logical bytes. It stores values exactly, without rounding errors, which makes it the better fit for finance and other calculations that must be precise.
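The difference between binary floating point and fixed-point decimals is easy to reproduce outside BigQuery. This is a rough sketch in Python, where float is the same IEEE 754 double as FLOAT64 and the standard decimal module stands in for NUMERIC. That stand-in is an assumption: Decimal is not bound to NUMERIC's precision or rounding rules, so the scale of 9 is applied by hand.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly,
# so the sum picks up a small rounding error.
float_sum = 0.1 + 0.2
float_matches = float_sum == 0.3                 # False

# Decimal arithmetic keeps the digits exactly as written.
decimal_sum = Decimal("0.1") + Decimal("0.2")
decimal_matches = decimal_sum == Decimal("0.3")  # True

# Illustration of a scale of 9: keep 9 digits after the decimal point.
SCALE_9 = Decimal("0.000000001")
third = (Decimal(1) / Decimal(3)).quantize(SCALE_9)

print(float_sum)    # 0.30000000000000004
print(decimal_sum)  # 0.3
print(third)        # 0.333333333
```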
LOGICAL_AND and LOGICAL_OR are versatile aggregate functions in BigQuery that are not part of standard SQL. LOGICAL_OR checks whether at least one value in a group is TRUE, while LOGICAL_AND checks whether all values are TRUE. They can be paired with NOT, used with GROUP BY and HAVING, or used as window functions together with QUALIFY. Examples include checking whether all of a customer's orders are paid, whether any orders are outstanding, or whether olives were ordered in the last 3 months.
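The semantics of the two aggregates can be illustrated without BigQuery. The sketch below emulates them in plain Python, with all() playing the role of LOGICAL_AND and any() the role of LOGICAL_OR over each group. The order data is made up, and NULL handling is not modelled.

```python
from collections import defaultdict

# Hypothetical (customer, is_paid) rows, one per order.
orders = [
    ("alice", True), ("alice", True),
    ("bob", True), ("bob", False),
    ("carol", False),
]

# Equivalent of GROUP BY customer: collect the flags per customer.
paid_flags = defaultdict(list)
for customer, is_paid in orders:
    paid_flags[customer].append(is_paid)

# LOGICAL_AND(is_paid): are all of the customer's orders paid?
all_paid = {c: all(flags) for c, flags in paid_flags.items()}

# LOGICAL_OR(NOT is_paid): does the customer have any outstanding order?
any_outstanding = {c: any(not f for f in flags) for c, flags in paid_flags.items()}

print(all_paid)         # {'alice': True, 'bob': False, 'carol': False}
print(any_outstanding)  # {'alice': False, 'bob': True, 'carol': True}
```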
Using LIMIT in BigQuery generally doesn't save costs, because it reduces only the number of rows returned, not the amount of data processed. It is still useful for validating assumptions about data, for example by quickly checking for duplicate records: if a query that looks for violations returns no rows even with LIMIT, the initial hypothesis holds. While there is no cost difference, LIMIT can save time on large tables. The comments on the original post cover cases where LIMIT does affect performance.
The NOT NULL constraint in BigQuery lets users declare, when creating a table, that a column must not contain NULLs. On an existing table this can be checked through the column's MODE, which is NULLABLE, REQUIRED (when NOT NULL is set), or REPEATED (for ARRAYs). An INSERT or UPDATE that violates the constraint fails. Setting the correct MODE for each field therefore enforces expectations about the data.
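The behaviour of a NOT NULL column can be sketched locally with Python's built-in sqlite3 module. SQLite is only a stand-in here: it has no MODE attribute, so the NULLABLE/REQUIRED labels below are derived from its own notnull flag as an analogy, and the customers table is hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for BigQuery (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER NOT NULL, email TEXT)")

# PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk) per column.
# Map the notnull flag onto BigQuery-style mode names for illustration.
modes = {
    row[1]: "REQUIRED" if row[3] else "NULLABLE"
    for row in conn.execute("PRAGMA table_info(customers)")
}
print(modes)  # {'id': 'REQUIRED', 'email': 'NULLABLE'}

# A NULL in the nullable column is accepted.
conn.execute("INSERT INTO customers VALUES (1, NULL)")

# A NULL in the NOT NULL column makes the whole statement fail.
try:
    conn.execute("INSERT INTO customers VALUES (NULL, 'a@example.com')")
    insert_failed = False
except sqlite3.IntegrityError:
    insert_failed = True

row_count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
print(insert_failed, row_count)  # True 1
```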
BigQuery has introduced a feature, currently in preview, that simplifies the management of row-level access security. Users no longer need to specify access filter values manually. Instead, they can refer to a lookup table and combine it with the SESSION_USER function to filter the values a principal should see.
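The lookup-table idea can be approximated as follows. This is not BigQuery's row access policy syntax, only a sketch of the filtering logic in Python's built-in sqlite3 module: the sales and region_access tables are hypothetical, and the session_user argument stands in for the value SESSION_USER would return.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for BigQuery (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount INTEGER);
    CREATE TABLE region_access (email TEXT, region TEXT);
    INSERT INTO sales VALUES ('EU', 100), ('US', 200), ('APAC', 300);
    INSERT INTO region_access VALUES
        ('ana@example.com', 'EU'),
        ('ana@example.com', 'US'),
        ('bo@example.com',  'APAC');
""")

def visible_rows(session_user):
    """Return the sales rows whose region the lookup table grants to this user."""
    sql = """
        SELECT region, amount
        FROM sales
        WHERE region IN (SELECT region FROM region_access WHERE email = ?)
        ORDER BY region
    """
    return conn.execute(sql, (session_user,)).fetchall()

print(visible_rows("ana@example.com"))  # [('EU', 100), ('US', 200)]
print(visible_rows("bo@example.com"))   # [('APAC', 300)]
print(visible_rows("eve@example.com"))  # []
```

Granting or revoking access then becomes a change to the rows of the lookup table rather than an edit to the filter itself.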