Data Craze Weekly #1

This message was first sent to subscribers of the Data Craze Weekly newsletter.

A weekly dose of curated information from the data world!
Data engineering, analytics, and case studies, straight to your inbox.

    The administrator of the personal data being processed, including the data provided above, is Data Craze - Krzysztof Bury, Piaski 50 st., 30-199 Rząska, Poland, NIP: 7922121365. By subscribing to the newsletter, you consent to the processing of your personal data (name, e-mail) as part of Data Craze activities.


    Week in Data

    CDC - Change Data Capture the Netflix way

    The article is a bit old but still offers a lot of value, especially if you want a gentle introduction to CDC (Change Data Capture).

    It explains what CDC is, what you can expect from it, and how Netflix approached the topic.

    Link: https://netflixtechblog.com/dblog-a-generic-change-data-capture-framework-69351fb9099b

    PostgreSQL 15 without default CREATE for PUBLIC schema

    Starting with PostgreSQL 15, the default CREATE privilege on the public schema is revoked for users who are not the database owner or a superuser. If you rely heavily on the public schema, please read this carefully.

    Link: https://andreas.scherbaum.la/blog/archives/1120-Changes-to-the-public-schema-in-PostgreSQL-15-and-how-to-handle-upgrades.html
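
    As a rough sketch of how the PostgreSQL 15 change can be handled, the statements below check the privilege and show three possible ways to react. The role name app_user and the schema name app are hypothetical, and which option fits depends on your setup. Databases upgraded from an earlier version should keep the privileges they already had, so this mostly matters for newly created databases.

```sql
-- Hypothetical names: app_user (role) and app (schema). Adjust to your setup.

-- Check whether a role can create objects in the public schema.
SELECT has_schema_privilege('app_user', 'public', 'CREATE');

-- Option 1: give the role its own schema instead of relying on public.
CREATE SCHEMA app AUTHORIZATION app_user;

-- Option 2: grant CREATE on public to that one role only.
GRANT CREATE ON SCHEMA public TO app_user;

-- Option 3: restore the pre-15 behaviour for every role (least restrictive).
GRANT CREATE ON SCHEMA public TO PUBLIC;
```

    Option 1 is usually the cleanest, because it keeps application objects out of a schema that every role can see by default.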

    Documenting Data yet again

    The longer we work with data, models, tables, schemas, etc., the more we get lost in all of these objects.

    Questions start to arise:

    What is this table for?
    Is it being used?
    What was its goal?

    I guess you can agree that documenting stuff is not the most attractive part of working with data … but hell, it is essential. The sooner we realise that, the easier our work will be down the road.

    Fortunately or unfortunately, we as people working with data are like librarians: we create order out of chaos (by cataloging and describing data, etc.). In the linked article, the author describes some best practices to help us on this journey.

    Link: https://towardsdatascience.com/data-documentation-best-practices-3e1a97cfeda6

    Tools

    sql_formatter - are you working in a team where you are constantly discussing where commas should land in SQL (at the end or at the beginning of an attribute)?

    Why not use a formatter that you can seamlessly add to your git workflow and that will do all the work for you, with the one proper way of placing commas (at the beginning 🙂)?

    Link: https://github.com/PabloRMira/sql_formatter

    Check Your Skills

    Without using the MAX function, get the second-largest product identifier (column PRODUCT_ID) from the PRODUCTS table.

    #SQL

    Answer:


            SELECT product_id
              FROM products
          ORDER BY product_id DESC
             LIMIT 1
            OFFSET 1;
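
    To try the answer out, a tiny sample table is enough. The sketch below uses made-up data (the table definition and values are assumptions, not the real PRODUCTS table) and adds a DISTINCT variant for the case where the column could contain duplicates.

```sql
-- Hypothetical sample data; the real PRODUCTS table will have more columns.
CREATE TABLE products (product_id INTEGER PRIMARY KEY);
INSERT INTO products (product_id) VALUES (10), (20), (30), (40);

-- Sort descending, skip the largest value, take the next one.
SELECT product_id
  FROM products
 ORDER BY product_id DESC
 LIMIT 1
OFFSET 1;
-- returns 30 for the sample rows above

-- Assumption: if the column could hold duplicates (for example in a table
-- without a primary key), DISTINCT keeps ties from hiding the second value.
SELECT DISTINCT product_id
  FROM products
 ORDER BY product_id DESC
 LIMIT 1
OFFSET 1;
```

    With a unique identifier both queries return the same row; the DISTINCT form only matters when the largest value can appear more than once.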
    

    Data Jobs