This advice isn’t grounded in reality.
Management normally defines ways to track and judge itself; these are typically called Key Performance Indicators (KPIs).
KPIs are normally things like contract value growth, new contracts signed, profit margin, etc…
So if the project manager is meeting or exceeding their KPIs and you walk up to their boss telling them the PM is failing at basic job functions, the boss won't care.
This is because the boss might have set the KPIs, or the boss might also be judged on them. In either situation it's to the boss's advantage to ignore you.
The boss will only care if there is a KPI you can demonstrate the PM is failing to meet.
Every person/group will have various incentives and motivations. To effect change you have to understand what they are.
A project manager has responsibility for delivery of a project, but they typically lack domain-specific knowledge. As a result they can't directly deliver anything; they can only ask subject matter experts for advice and facilitate a team to deliver.
Most PMs cope with the stress of this position poorly.
This cartoon is an example of micromanagement (a common coping mechanism): the manager has involved themselves in the low-level decisions because that gives a sense of control. If a technical team then tells them it's a bad decision, the team is effectively attacking their coping mechanism.
The solution isn't to tell them their technical idea is terrible. Once you've fallen down this rabbit hole you have to treat the PM as a stakeholder: someone you have to manage. A common solution is to give them confidence that there is a path to delivery, and a way to track and understand it.
During the pandemic I had some unoccupied Python graduates I wanted to teach data engineering to.
Initially I had them implement REST wrappers around Apache OpenNLP and spaCy and then compare the results on random data sets (Project Gutenberg, SharePoint, etc…).
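To give a flavour of the exercise (a trimmed-down sketch, not the grads' actual wrapper: the REST layer is omitted and the model path is a placeholder), the OpenNLP side boils down to loading a pre-trained model and running the name finder over tokenised text:

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.Arrays;

import opennlp.tools.namefind.NameFinderME;
import opennlp.tools.namefind.TokenNameFinderModel;
import opennlp.tools.tokenize.SimpleTokenizer;
import opennlp.tools.util.Span;

public class OpenNlpNerDemo {
    public static void main(String[] args) throws Exception {
        // Placeholder path: the standard pre-trained person-name model,
        // downloaded separately from the OpenNLP model repository.
        try (InputStream modelIn = new FileInputStream("en-ner-person.bin")) {
            NameFinderME finder = new NameFinderME(new TokenNameFinderModel(modelIn));

            String[] tokens = SimpleTokenizer.INSTANCE
                    .tokenize("Alice met Bob outside the British Library on Tuesday.");

            // Each Span carries the matched token range plus a confidence score,
            // which is where the OpenNLP and spaCy outputs differed.
            for (Span span : finder.find(tokens)) {
                String match = String.join(" ",
                        Arrays.copyOfRange(tokens, span.getStart(), span.getEnd()));
                System.out.printf("%s -> %.2f%n", match, span.getProb());
            }
        }
    }
}
```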
I ended up stealing a grad data scientist because we couldn't find a difference (while there was a difference in confidence scores, the actual matches were identical).
spaCy required 1 vCPU and 12 GiB of RAM to produce the same results as OpenNLP running on 0.5 vCPU and 4.5 GiB of RAM.
Two grads were assigned a Spring Boot/Camel/OpenNLP stack and two a spaCy/Flask application. It took both groups four weeks to get a working result.
The team slowly acquired lockdown staff, so I introduced MinIO/RabbitMQ/NiFi/Hadoop/Express/React and then different file types (not just raw UTF-8, but doc, pdf, etc…) into the NLP pipelines. They built a fairly complex NLP processing system with a data exploration UI.
I figured I had a group to help me work out the best Python approach in the space, but Python's limitations just led to things like needing a Kubernetes volume to host the data.
Conversely none of the data scientists we acquired were willing to code in anything but Python.
I tried arguing at my company at the time that there was a huge unsolved bit of the market there (e.g. MLOps).
Alas, unless you can show a profit on the first customer, no business will invest. Which is why I am trying to start a business.
This is why Java rocks at ETL: the language is built to access files via input/output streams.
It means you don't need to download a local copy of a file; you can drop it into a data lake (S3, HDFS, etc…) and pass around a URI reference.
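A minimal sketch of what I mean (assuming the Hadoop client libraries are on the classpath, plus hadoop-aws for s3a://; the bucket path below is made up): the same few lines of stream handling work whether the URI points at HDFS, S3 or a local file, and the URI is all the components need to pass around.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URI;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class StreamFromLake {
    // The URI (hdfs://..., s3a://..., file://...) is the only thing shared between components.
    static long countLines(URI uri) throws Exception {
        Configuration conf = new Configuration();
        try (FileSystem fs = FileSystem.get(uri, conf);
             BufferedReader reader = new BufferedReader(
                     new InputStreamReader(fs.open(new Path(uri)), StandardCharsets.UTF_8))) {
            // The file is streamed; nothing is copied to local disk first.
            return reader.lines().count();
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical location: swap in a real bucket or cluster path.
        System.out.println(countLines(URI.create("s3a://example-bucket/raw/books/sample.txt")));
    }
}
```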
Considering the size of large language models, I really am surprised at how poorly streaming is handled within Python.
@ergoplato I didn’t suggest that.
Personally I don't think it's ego. I think you have two issues.
The first is that people go through stages learning DevOps. Stage 1 has people deploy a CI server because it's cool; they build a few basic pipelines and then 90% of people get bored. In stage 2 people start extending those pipelines, which results in really complex pipelines requiring lots of unique changes based on the opinions of the writer. You move to stage 3 when you're asked to recreate/extend them for a new project and realise how specific your solutions are.
Learning how to make minor tweaks and hook in a few key points to get what you want takes years. Without that, most packagers will want to make big changes upstream, which won't go down well.
The second issue: I have met quite a few developers who become highly stressed when the build system is doing something they have never had to do or understand themselves.
A really simple example: I have a Jenkins function which I tend to slip into release pipelines; it captures the release version and creates a matching version in Jira.
I normally deploy it first, as a test, before a few other functions that automate various service management requirements.
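For illustration only (this is not the actual Jenkins shared-library step, and the base URL, credentials and project key are all placeholders): the heart of it is a single REST call to Jira's version-creation endpoint, something like this standalone Java sketch.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Base64;

public class CreateJiraVersion {
    public static void main(String[] args) throws Exception {
        String baseUrl = "https://jira.example.com";                   // placeholder Jira instance
        String auth = Base64.getEncoder()
                .encodeToString("bot-user:api-token".getBytes());       // placeholder credentials
        // Release version captured from the pipeline, plus a placeholder project key.
        String body = String.format("{\"name\": \"%s\", \"project\": \"%s\"}", "1.4.2", "DEMO");

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/rest/api/2/version"))
                .header("Authorization", "Basic " + auth)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```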
It's surprising how many devs will suddenly decide every problem (test failed, code failed review, SharePoint breaks, bad OS update, etc…) is due to that function.
For me this little function is a test: if the team doesn't care, I will work to integrate the various other bits. If they freak out, I'll revert it and decide whether it is worth walking them through the process or walking away.
One of the reasons for the #DevOps movement is that developers see building and packaging as #notmyjob.
The task would historically fall on the most junior member of the team, who would make a pig's ear out of it due to a complete lack of experience.
This is compounded by the issue that most C/C++ build systems don’t really include dependency management.
Linux distributions have all tried to work out those dependency trees, but they each came up with slightly different solutions. This is why there are a few "root" distributions that everything else branches from.
That means developers have to learn about a few root distributions and design deb/rpm/AUR packaging to base their releases around.
That is a considerable amount of learning in a subject most aren’t interested in.
The real question is why don’t package maintainers upstream a packaging solution?
I am currently teaching Python and JavaScript devs TypeScript. Every time they hit a problem they switch to any.
Sigh
It's even better…
In C/C++ you have to create/destroy objects and deal with pointers (**& for the win!). As a result the average C/C++ developer has to think about a lot more than the average Java developer.
In a previous job there were a lot of legacy C and C++ applications. They wanted to replace some components with local Java applications because Eclipse RCP was the new cool thing.
So we would build a replacement and then have 2-4 weeks for performance tuning. That basically involved one person from the team attaching the YourKit profiler and, at bottleneck points, nudging the code towards either pure OO or C-like procedural style.
Every single time, the result used less CPU and RAM than the C/C++ application when running through the original application's test packs, even when those applications had gone through multiple rounds of performance tuning.
We were given some time to figure out why. Our conclusion was that while any one part of the Java application would be slower, the reduced mental load led to better performance overall.
Java 11 allows you to paste in JavaScript and it largely just works, which is cool and terrible.
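A contrived illustration of the overlap (my example, nothing more): with var and C-style control flow, the middle four lines below are valid JavaScript and valid Java (10+), character for character.

```java
public class LooksLikeJs {
    public static void main(String[] args) {
        // These four lines are valid in both languages.
        var total = 0;
        for (var i = 1; i <= 10; i++) {
            total += i;
        }
        // Java-only from here on.
        System.out.println(total);
    }
}
```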
There are some really good ideas but I think a split is coming.
Java 17 code looks like an entirely different language from Java 1.8, and with the pace of change, keeping backwards compatibility is increasingly difficult.
Meanwhile the actual backend improvements keep being pushed back.
It's a really immature and naive response from Kev. Information is power, and he's chosen to operate without knowledge for internet points.
Meta think there is potential to enlarge their market and make money; Kev's response won't impact their business decisions.
Kev should have gone to the meeting to understand what Meta are planning. That would help him figure out how to deal with Meta entering the space.
I don't expect he could shape their approach, but knowing they want to do X, Y or Z might make certain features/fixes a priority so it doesn't impact everyone else.
I think there is a focus on C/C++ to justify Python's performance.
There have been times when the performance of Node.js/Python was part of the reasoning for choosing Java/Scala.
Each time you are regaled with how Python is C/C++ underneath and therefore fast. Each time you have to ask whether they will write C/C++ libraries to ensure the application's performance meets our needs.
Similarly, any time you go near lambdas you get a comment on how Python lambdas are faster because it's C running, whereas Java has to start a virtual machine, etc…
Three different companies, same thing.
The big argument for monorepos is that checking out multiple repositories is "hard" while checking out one repository is "easy", but…
Service Oriented Architecture became a thing because monolithic code bases were often becoming spaghetti. I worked on a project where removing an option from a preferences window (max map zoom) broke a message table, because the number of visible rows in the table (not its size in the UI) was, for no reason, linked to the max zoom you supplied to a map library.
Thus the idea that you should wrap everything you do as a self-contained service with a known interface. The idea being that you could write an entirely new implementation of a service, implement the interface, and everything would still work. Microservices are a continuation of this idea.
Yet every Node/Python based monorepo I have seen will have Python files directly imported from inside another component/service. Not simply common artefacts but all sorts of random parts, subverting the concept of a microservice (and recreating the problem).
Separate repositories block this because each repository is built in isolation on the CI, flagging the link. This forces you to release each repository and pull things in as dependencies, which encourages you to design code to support that.
A common monorepo problem is shoving everything into a Docker image and calling it a day. Then, if you need a class from one monorepo in another, there is no artefact to depend on, so lazy devs just copy/paste files between monorepos.
Monorepos aren't bad practice by themselves, but they encourage bad practice. Separate repositories encourage good practice (literally, the need to manage them separately drives it).
Basic rule: if someone claims X magically solves a problem, they don't follow X and are a huge generator of the problem.
For example, people who claim they don't need to write comments because they write self-documenting code are the people who use variable names like x1, x2, y, etc…
Similarly, anyone you meet claiming Test Driven Development means they have better tests will write code with appalling coverage and epically bad tests.