Welcome to today’s daily kōrero!

Anyone can make the thread, first in first served. If you are here on a day and there’s no daily thread, feel free to create it!

Anyway, it’s just a chance to talk about your day, what you have planned, what you have done, etc.

So, how’s it going?

  • catsdoingcatstuff@lemmy.nz
    11 months ago

    I’m a data engineer working mostly in Python and SQL, embedded in a data analytics team. Our main use cases are ingestion pipelines (API sources, glue scripts, batch jobs in Airflow and AWS) and some work in pandas that doesn’t fit into our dbt SQL models. I think it’s also nice for data exploration and sharing via Jupyter/Colab notebooks.

    What are you thinking of using it for?

    • NoRamyunForYou@lemmy.nz
      11 months ago

      There are a few different reasons that I’ve thought about for now:

      • A lot of the data that we are working with is quite large, and it’s sometimes a struggle to work with it in Google Sheets / Excel (Unfortunately our workplace uses both for some reason)

      • I have some weekly reports that I’ve somehow ended up generating (getting data via SQL, massaging the data, and presenting via a dashboard or sharing a spreadsheet).

      • For creating a repeatable set of calculations when someone asks for something (which I’m sort of doing via Power Query or Google Apps Script)

      • I’m quite big on visualizations, so I want to give Matplotlib a go.

      • And I already do some coding (JavaScript & C++ (Arduino)), and have always wanted to add Python to my list of skills, especially in recent times, as I begin to delve more into data.

      • catsdoingcatstuff@lemmy.nz
        11 months ago

        Those sound like perfect scenarios! One of the first projects that got me hooked on Python was processing large CSV files instead of opening them in Excel and running Visual Basic on them.
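
        As a rough illustration of that kind of job, here is a minimal sketch that streams a CSV row by row with the standard library instead of opening the whole file at once. The function name and the column names in the usage note are made up for the example, not taken from a real project.

```python
import csv
from collections import defaultdict


def total_by_column(path, key_column, value_column):
    """Stream a CSV file and sum value_column for each distinct key_column.

    Only one row is held in memory at a time, so file size is not limited
    by what a spreadsheet application can open.
    """
    totals = defaultdict(float)
    with open(path, newline="", encoding="utf-8") as handle:
        # DictReader uses the header row to name each field.
        for row in csv.DictReader(handle):
            totals[row[key_column]] += float(row[value_column])
    return dict(totals)
```

        Calling something like `total_by_column("sales.csv", "region", "amount")` (hypothetical file and columns) would return a plain dict of totals per region.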

        If you haven’t already, you should check out DuckDB for working with your larger data sets, too. It’s pretty neat. https://duckdb.org/

        • NoRamyunForYou@lemmy.nz
          11 months ago

          I’ve had a brief look into DuckDB, and I’m not too sure if I’m interpreting its use case correctly, but does it basically allow you to use SQL within your Python to query the large datasets that you have locally?

          • catsdoingcatstuff@lemmy.nz
            11 months ago

            That’s right. You can read in structured files and query them locally without having to load them into a database. It’s nice in cases where you would rather write analytics SQL, or want to convert between SQL and pandas. It’s very quick to load files and run queries. It can connect to databases, too.