<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title>tomis9's cookbook</title>
<link>https://tomis9.github.io/</link>
<description>Recent content on tomis9's cookbook</description>
<generator>Hugo -- gohugo.io</generator>
<lastBuildDate>Fri, 11 Oct 2019 19:38:20 +0200</lastBuildDate>
<atom:link href="https://tomis9.github.io/index.xml" rel="self" type="application/rss+xml" />
<item>
<title>beautiful soup</title>
<link>https://tomis9.github.io/beautiful_soup/</link>
<pubDate>Fri, 11 Oct 2019 19:38:20 +0200</pubDate>
<guid>https://tomis9.github.io/beautiful_soup/</guid>
<description>1. What is beautiful soup and why would you use it? it&rsquo;s a web scraping Python package
sometimes your data is scattered across various pages on the internet. Beautiful Soup turns out to be super-helpful for collecting this sort of data automatically.
btw. I love the name. It sounds so random.
2. The basics Beautiful Soup can &ldquo;understand&rdquo; html code, which you download from the internet using the requests module:</description>
</item>
<item>
<title>heroku</title>
<link>https://tomis9.github.io/heroku/</link>
<pubDate>Tue, 13 Aug 2019 19:10:08 +0200</pubDate>
<guid>https://tomis9.github.io/heroku/</guid>
<description>1. What is heroku and why would you use it? heroku is a platform which lets you run your django app on a server for free. Or at least this is how I use it ;)
available at heroku.com
2. Curiosities deployment of your app is very simple. You write your code, test it locally and push to the repo. The app is automatically deployed, as long as you set heroku to be one of your remote git repos.</description>
</item>
<item>
<title>learning tensorflow</title>
<link>https://tomis9.github.io/learning_tensorflow/</link>
<pubDate>Mon, 05 Aug 2019 18:45:18 +0200</pubDate>
<guid>https://tomis9.github.io/learning_tensorflow/</guid>
<description>The best way to gain intuition for any new thing you learn is to start from the very beginning and play with it (let’s see what happens if I do this). That’s the power of reinforcement learning ;)
These packages will come in handy soon:
import numpy as np from sklearn.datasets import load_iris from sklearn.linear_model import LinearRegression from sklearn.preprocessing import OneHotEncoder from sklearn.metrics import accuracy_score import tensorflow as tf ## /usr/local/lib/python3.</description>
</item>
<item>
<title>writing a cookbook</title>
<link>https://tomis9.github.io/cookbook/</link>
<pubDate>Fri, 26 Jul 2019 17:58:33 +0200</pubDate>
<guid>https://tomis9.github.io/cookbook/</guid>
<description>1. Why write a cookbook/blog? In 2016, when I decided to become a data scientist, I was overwhelmed by the number of skills I had to possess to start this sort of career. Reading job offers convinced me that I should be a specialist in:
statistics and mathematics (which I learned at the university),
machine learning and data mining (which are fairly easy to learn by yourself if you have a statistical background),</description>
</item>
<item>
<title>tensorflow_serving</title>
<link>https://tomis9.github.io/tensorflow_serving/</link>
<pubDate>Fri, 31 May 2019 15:29:17 +0200</pubDate>
<guid>https://tomis9.github.io/tensorflow_serving/</guid>
<description>#!/usr/bin/python3 import tensorflow as tf from tensorflow.python.saved_model import builder as saved_model_builder from tensorflow.python.saved_model import signature_constants from tensorflow.python.saved_model import signature_def_utils from tensorflow.python.saved_model import tag_constants from tensorflow.python.saved_model.utils import build_tensor_info placeholder_name = 'a' operation_name = 'add' a = tf.placeholder(tf.int32, name=placeholder_name) b = tf.constant(10) # This is our model add = tf.add(a, b, name=operation_name) with tf.Session() as sess: # super complicated model ten_plus_two = sess.run(add, feed_dict={a: 2}) print('10 + 2 = {}'.format(ten_plus_two)) # from this point on we do everything needed to save the model # we want to convert the inputs and outputs to a saveable format # first we turn them into tensors a_tensor = sess.</description>
</item>
<item>
<title>model validation</title>
<link>https://tomis9.github.io/validation/</link>
<pubDate>Thu, 23 May 2019 12:46:03 +0200</pubDate>
<guid>https://tomis9.github.io/validation/</guid>
<description>1. What is model validation and why would you do it? You have trained your model and naturally you are wondering how good it is. There are several ways to find out, like measuring the accuracy of predictions, but you may also want to check where exactly particular predictions come from, as this is far from obvious for “black box” models.
2. Examples cross-validation confusion matrix https://www.rdocumentation.org/packages/caret/versions/3.45/topics/confusionMatrix
A confusion matrix is not as confusing as the name may suggest.</description>
</item>
<item>
<title>useful processing</title>
<link>https://tomis9.github.io/useful_processing/</link>
<pubDate>Fri, 17 May 2019 15:44:38 +0200</pubDate>
<guid>https://tomis9.github.io/useful_processing/</guid>
<description>1. What is useful processing? many machine learning algorithms require the same kinds of data preprocessing in order for them to work properly. In other words, these kinds of processing are useful. 2. Examples one-hot encoding R
# data.table dt_iris &lt;- data.table::as.data.table(iris) mltools::one_hot(dt_iris) ## Sepal.Length Sepal.Width Petal.Length Petal.Width Species_setosa ## 1: 5.1 3.5 1.4 0.2 1 ## 2: 4.9 3.0 1.4 0.2 1 ## 3: 4.7 3.</description>
</item>
<item>
<title>basic machine learning algorithms</title>
<link>https://tomis9.github.io/ml/</link>
<pubDate>Mon, 22 Apr 2019 18:05:21 +0200</pubDate>
<guid>https://tomis9.github.io/ml/</guid>
<description>1. What is machine learning and why would you use it? it’s a rather complicated, yet beautiful tool for boldly going where no man has gone before.
in other words, it enables you to extract valuable information from data.
2. Examples of the most popular machine learning algorithms in Python and R We’ll be working on the iris dataset, which is easily available in Python (from sklearn import datasets; datasets.</description>
</item>
<item>
<title>featuretools</title>
<link>https://tomis9.github.io/featuretools/</link>
<pubDate>Thu, 07 Feb 2019 11:27:29 +0100</pubDate>
<guid>https://tomis9.github.io/featuretools/</guid>
<description>This is not a proper blog post yet, just my notes. featuretools (TODO)
https://blog.featurelabs.com/deep-feature-synthesis/
https://towardsdatascience.com/why-automated-feature-engineering-will-change-the-way-you-do-machine-learning-5c15bf188b96
https://docs.featuretools.com/#minute-quick-start
https://blog.featurelabs.com/featuretools-on-spark-2/</description>
</item>
<item>
<title>cassandra</title>
<link>https://tomis9.github.io/cassandra/</link>
<pubDate>Tue, 05 Feb 2019 17:33:24 +0100</pubDate>
<guid>https://tomis9.github.io/cassandra/</guid>
<description>This is not a proper blog post yet, just my notes. cassandra (TODO)
Installation is described on the cassandra webpage, but it didn&rsquo;t work for me, so I&rsquo;ve decided to use cassandra&rsquo;s docker image.
This is a very nice tutorial on setting up a 3-node cassandra cluster on docker, creating a simple table, populating it with data and retrieving results.
Looks like a decent example of using cassandra on (docker with python)[https://mannekentech.</description>
</item>
<item>
<title>passing arguments to scripts</title>
<link>https://tomis9.github.io/passing_arguments/</link>
<pubDate>Tue, 05 Feb 2019 09:34:23 +0100</pubDate>
<guid>https://tomis9.github.io/passing_arguments/</guid>
<description>This is not a proper blog post yet, just my notes. passing arguments (TODO)
caffee
R A good article on passing arguments to R scripts
You can read the docs of commandArgs for more info, but the general use is very simple:
args &lt;- commandArgs(trailingOnly = TRUE) print(args) # args is a vector of values Rscript file.R one two 3 If trailingOnly is set to FALSE, args will contain some other argument values, e.</description>
</item>
<item>
<title>vim vs emacs - text mining to settle the editor war</title>
<link>https://tomis9.github.io/vim_vs_emacs/</link>
<pubDate>Sat, 26 Jan 2019 14:12:24 +0100</pubDate>
<guid>https://tomis9.github.io/vim_vs_emacs/</guid>
<description>This project is not finished and will be developed over time.
0. Why I’m making this project I want to try out new programming and statistical tools that I read or heard about (tidytext, nlp, laser) on a subject that I find interesting (text editors).
If somebody asks me if vim is a good choice for a text editor, I want to have objective arguments to prove my statement.</description>
</item>
<item>
<title>pandas</title>
<link>https://tomis9.github.io/pandas/</link>
<pubDate>Fri, 25 Jan 2019 13:46:12 +0100</pubDate>
<guid>https://tomis9.github.io/pandas/</guid>
<description>This is not a proper blog post yet, just my notes. pandas (TODO) 1. What is pandas and why would you use it? pandas is a Python package created for working with tables known as DataFrames;
it is the only reasonable Python package for this purpose, which makes Python look a little modest compared to R (base, data.table, dplyr - every one of them has a better interface than pandas) when it comes to processing tables;</description>
</item>
<item>
<title>decision trees</title>
<link>https://tomis9.github.io/decision_trees/</link>
<pubDate>Fri, 11 Jan 2019 21:18:57 +0100</pubDate>
<guid>https://tomis9.github.io/decision_trees/</guid>
<description>1. What are decision trees and why would you use them? decision trees are among the most popular classification algorithms;
they divide the dataset hierarchically, starting from the full dataset, until a stopping criterion is met, e.g. the minimum size of a leaf and the purity of a leaf;
in general they are easy to understand, interpret and visualise
however they are not very efficient, but they can scale, i.</description>
</item>
<item>
<title>caffee</title>
<link>https://tomis9.github.io/caffee/</link>
<pubDate>Sun, 06 Jan 2019 19:44:53 +0100</pubDate>
<guid>https://tomis9.github.io/caffee/</guid>
<description>This is not a proper blog post yet, just my notes. caffee (TODO)
caffee</description>
</item>
<item>
<title>theano</title>
<link>https://tomis9.github.io/theano/</link>
<pubDate>Sun, 06 Jan 2019 19:36:46 +0100</pubDate>
<guid>https://tomis9.github.io/theano/</guid>
<description>This is not a proper blog post yet, just my notes. theano (TODO)
theano vs tensorflow</description>
</item>
<item>
<title>pytorch</title>
<link>https://tomis9.github.io/pytorch/</link>
<pubDate>Sun, 06 Jan 2019 17:44:32 +0100</pubDate>
<guid>https://tomis9.github.io/pytorch/</guid>
<description>1. What is pytorch and why would you use it? pytorch is a python package which makes training deep neural networks relatively easy and fast
its main “rival” is tensorflow: pytorch was released by Facebook, tensorflow by Google
2. “Hello world” example inspired by this article
Let’s define some data:
import numpy as np import pandas as pd import matplotlib.pyplot as plt theta = 2 x = np.</description>
</item>
<item>
<title>tensorflow</title>
<link>https://tomis9.github.io/tensorflow/</link>
<pubDate>Tue, 01 Jan 2019 18:17:38 +0100</pubDate>
<guid>https://tomis9.github.io/tensorflow/</guid>
<description>1. What is tensorflow and why would you use it? tensorflow is a machine learning framework
which has APIs for Python, C++ and R
and lets you evaluate any machine learning algorithm, especially deep learning:
quickly (all the computations are performed in C++)
easily - you can view your results in a GUI - Tensorboard
on a huge amount of data, as tensorflow scales easily to many machines and can even make use of GPUs.</description>
</item>
<item>
<title>rstanarm</title>
<link>https://tomis9.github.io/rstanarm/</link>
<pubDate>Sun, 23 Dec 2018 20:19:18 +0100</pubDate>
<guid>https://tomis9.github.io/rstanarm/</guid>
<description>1. What is rstanarm and why would you use it? it&rsquo;s an R interface to stan
it&rsquo;s better than rstan, because (according to the rstanarm webpage)
models are specified with formula syntax,
data is provided as a data frame, and
additional arguments are available to specify priors.
in a nutshell, rstanarm lets you estimate various Bayesian models and examine them with shinystan.
2.</description>
</item>
<item>
<title>nlp</title>
<link>https://tomis9.github.io/nlp/</link>
<pubDate>Sun, 23 Dec 2018 15:44:02 +0100</pubDate>
<guid>https://tomis9.github.io/nlp/</guid>
<description>This is not a proper blog post yet. nlp (TODO)
NLP (natural language processing) Tutorials: Tutorial describing basic tools and techniques used in NLP: interpreting text as a group of characters/words/sentences and so on;
tokenisation;
token normalisation (stemming, lemmatisation).
Or this one, which is actually the beginning of a playlist.
Example NLP usage in Python link to the article
Text Mining Text mining in R - basic concepts in text mining, introduction to tidytext package and LDA</description>
</item>
<item>
<title>rocker</title>
<link>https://tomis9.github.io/rocker/</link>
<pubDate>Sun, 16 Dec 2018 15:47:35 +0100</pubDate>
<guid>https://tomis9.github.io/rocker/</guid>
<description>1. What is rocker and why would you use it? rocker is a docker container specially prepared for working with the R programming language;
it is useful if your R model is a part of a microservice system based on docker containers;
you can run R/shiny-server/rstudio-server on any machine you want. The only requirement is docker.
2. Rocker versions The official rocker site offers a few versions of R images.</description>
</item>
<item>
<title>hadoop</title>
<link>https://tomis9.github.io/hadoop/</link>
<pubDate>Tue, 04 Dec 2018 21:32:35 +0100</pubDate>
<guid>https://tomis9.github.io/hadoop/</guid>
<description>1. What is hadoop and why would you use it? hadoop is the first popular big data tool ever;
it lets you &ldquo;quickly&rdquo; process huge amounts of data by dividing the computation across many machines (a cluster of machines); &ldquo;quickly&rdquo; compared to a standard, one-machine approach;
you can store and easily access huge amounts of data thanks to hadoop&rsquo;s distributed file system (hdfs);
in general, hadoop is a cornerstone of big data.</description>
</item>
<item>
<title>git</title>
<link>https://tomis9.github.io/git/</link>
<pubDate>Mon, 03 Dec 2018 15:01:35 +0100</pubDate>
<guid>https://tomis9.github.io/git/</guid>
<description>1. What is git and why would you use it? Git is a totally basic program if you think seriously about programming. Seriously.
It&rsquo;s a version control system, which:
makes working on the same project with many people simple;
remembers the whole history of the project, i.e. all its changes, as long as you follow git&rsquo;s discipline
2. A &ldquo;Hello world&rdquo; example As you may have noticed, my posts usually contain a section called &lsquo;A &ldquo;Hello World&rdquo; example&rsquo;, but not this time.</description>
</item>
<item>
<title>spark</title>
<link>https://tomis9.github.io/spark/</link>
<pubDate>Fri, 23 Nov 2018 12:58:49 +0200</pubDate>
<guid>https://tomis9.github.io/spark/</guid>
<description>1. What is spark and why would you use it? Spark is a smooth framework for working with big data, e.g. data stored in hdfs;
it can be accessed from Python, R, scala (spark is actually written in scala) and java;
it is probably the most popular big data tool nowadays for data scientists.
2. A few &ldquo;Hello World&rdquo; examples a) pyspark Prerequisites Installation of pyspark In this tutorial we will work on a development python version of spark.</description>
</item>
<item>
<title>Hugo</title>
<link>https://tomis9.github.io/hugo/</link>
<pubDate>Fri, 09 Nov 2018 23:01:35 +0100</pubDate>
<guid>https://tomis9.github.io/hugo/</guid>
<description>This is not a proper blog post yet, just my notes. hugo (TODO)
A short tutorial
A really good introductory film:
And a &hellip;
Clone your favourite theme to /themes
Update config.toml based on the theme&rsquo;s website
moving your site to Amazon S3
Why would you do this?
you want to have your own domain, like tomis9.com, not tomis9.github.io
you want to find out what Amazon offers for cloud computing - this is a good training project</description>
</item>
<item>
<title>Rmarkdown total basics</title>
<link>https://tomis9.github.io/file/</link>
<pubDate>Fri, 09 Nov 2018 23:01:35 +0100</pubDate>
<guid>https://tomis9.github.io/file/</guid>
<description>This is not a proper blog post yet, just my notes. Rmarkdown (TODO) Before you begin, check out a very nice cheatsheet available at https://www.rstudio.com/wp-content/uploads/2016/03/rmarkdown-cheatsheet-2.0.pdf
Vim key mappings are rather intuitive, pretty similar to basic r filetype mappings: open up a console window, rmarkdown::render()s this file and shows the result in your favourite browser.
In general, Rmarkdown is just markdown with a possibility to add R chunks of code and execute them.</description>
</item>
<item>
<title>redis</title>
<link>https://tomis9.github.io/redis/</link>
<pubDate>Sun, 21 Oct 2018 20:49:20 +0200</pubDate>
<guid>https://tomis9.github.io/redis/</guid>
<description>This is not a proper blog post yet, just my notes. redis (TODO)
docker pull redis import redis import pandas as pd # https://www.youtube.com/watch?v=Hbt56gFj998 # https://redis-py.readthedocs.io/en/latest/ # open up a redis-server session in redis/src/redis-server redis_host = &quot;localhost&quot; redis_port = 6379 redis_password = &quot;&quot; r = redis.StrictRedis(host=redis_host, port=redis_port, password=redis_password, decode_responses=True) r.flushall() # save data to redis d = {key: str(value) for key, value in zip(list('abcdefghij'), range(10))} for key, value in d.</description>
</item>
<item>
<title>logging</title>
<link>https://tomis9.github.io/logging/</link>
<pubDate>Sat, 20 Oct 2018 00:15:21 +0200</pubDate>
<guid>https://tomis9.github.io/logging/</guid>
<description>1. What is logging and why would you use it? Logging, in general, sends information about the execution of a program to the outside of the program, e.g. to stdout or to a file. Why would that be useful?
You can find out how and when the program was executed, e.g. who was using its functionality and whether all the pieces of your program finished correctly.</description>
</item>
<item>
<title>sqlAlchemy</title>
<link>https://tomis9.github.io/sqlalchemy/</link>
<pubDate>Sat, 25 Aug 2018 16:22:07 +0100</pubDate>
<guid>https://tomis9.github.io/sqlalchemy/</guid>
<description>1. What is sqlAlchemy and why would you use it? sqlAlchemy is a python module that enables you to connect to and use sql databases without writing code in sql;
using sqlAlchemy has several advantages:
you will avoid using long sql strings in your code, which are difficult to read without syntax highlighting (unless you keep your sql queries in separate sql files);
you are far less exposed to sql injection attacks, as long as you do not paste user input into raw sql strings;</description>
</item>
<item>
<title>gitlab-ci</title>
<link>https://tomis9.github.io/gitlab_ci/</link>
<pubDate>Tue, 21 Aug 2018 17:56:38 +0200</pubDate>
<guid>https://tomis9.github.io/gitlab_ci/</guid>
<description>This is not a proper blog post yet, just my notes. gitlab-ci (TODO)
or gitlab Continuous Integration &amp; Deployment/Continuous Delivery
link to documentation
example of use in python and flask
how about installing gitlab locally?
https://medium.com/90seconds/continuous-integration-and-deployment-for-data-pipelines-at-90-seconds-53bf10521ea7</description>
</item>
<item>
<title>kafka</title>
<link>https://tomis9.github.io/kafka/</link>
<pubDate>Sat, 18 Aug 2018 14:33:57 +0200</pubDate>
<guid>https://tomis9.github.io/kafka/</guid>
<description>This is not a proper blog post yet, just my notes. kafka (TODO)
kafka tutorial
A gentle Introduction to Stream Processing
Kafka on docker</description>
</item>
<item>
<title>marathon</title>
<link>https://tomis9.github.io/marathon/</link>
<pubDate>Sat, 18 Aug 2018 14:24:14 +0200</pubDate>
<guid>https://tomis9.github.io/marathon/</guid>
<description>This is not a proper blog post yet, just my notes. marathon (TODO)
marathon If you want to have Marathon up and running, follow this tutorial.</description>
</item>
<item>
<title>vagrant</title>
<link>https://tomis9.github.io/vagrant/</link>
<pubDate>Tue, 14 Aug 2018 14:25:43 +0200</pubDate>
<guid>https://tomis9.github.io/vagrant/</guid>
<description>1. What is vagrant and why would you use it? Vagrant lets you set up and use virtual machines easily and quickly.
You store the full configuration of your VM in one text file (Vagrantfile), which makes it easily portable and trackable with git(hub).
Vagrant may be useful for testing new tools and software. It&rsquo;s more convenient than a traditional VM with a full GUI.
2. Installation First of all, check if you have virtualbox installed: type virtualbox and if no box pops up, install virtualbox:</description>
</item>
<item>
<title>airflow</title>
<link>https://tomis9.github.io/airflow/</link>
<pubDate>Tue, 14 Aug 2018 11:51:12 +0200</pubDate>
<guid>https://tomis9.github.io/airflow/</guid>
<description>1. What is airflow and why would you use it? airflow lets you manage your dataflow as a graph (directed acyclic graph, or DAG), which consists of separate Tasks, and schedule them. Wait, you may say, I can do that with cron!
Yes, you can, but with airflow:
you can easily divide your app into smaller tasks and monitor their reliability and execution duration;
the performance is more transparent;</description>
</item>
<item>
<title>flask</title>
<link>https://tomis9.github.io/flask/</link>
<pubDate>Mon, 13 Aug 2018 20:19:17 +0200</pubDate>
<guid>https://tomis9.github.io/flask/</guid>
<description>1. What is flask and why would you use it? flask is a python framework for creating web applications and APIs;
it provides full yet simple support for the backend, while you still create the frontend with html+css+javascript.
For production use it is not as popular as Django, as it does not scale that well to huge projects. However, in data science you will not create such huge webservices, and flask, with its simplicity, reliability, clarity and great community support, is more than enough.</description>
</item>
<item>
<title>pyenv, virtualenv, freeze</title>
<link>https://tomis9.github.io/pyenv/</link>
<pubDate>Sun, 12 Aug 2018 15:32:40 +0100</pubDate>
<guid>https://tomis9.github.io/pyenv/</guid>
<description>1. What are pyenv, virtualenv and freeze and why would you use them? these three tools let you install your favourite Python version with your favourite versions of Python packages on any machine, independently of those already installed on the system; you can even keep many Python versions and package versions side by side;
pyenv lets you install any Python version you like;
virtualenv lets you install any version of a Python package you like;</description>
</item>
<item>
<title>decorators</title>
<link>https://tomis9.github.io/decorators/</link>
<pubDate>Sun, 12 Aug 2018 15:30:35 +0200</pubDate>
<guid>https://tomis9.github.io/decorators/</guid>
<description>1. What are decorators and why would you use them? decorators in Python are special functions that take a function as an argument and slightly change its behaviour, e.g. its return value;
you can write your own decorators, which is rather easy (I highly recommend Fluent Python as a reference)
but there are already many useful decorators available in Python.
I am not going to describe here how to write your own decorator as, to be honest, I used them only twice in my career.</description>
</item>
<item>
<title>mesos</title>
<link>https://tomis9.github.io/mesos/</link>
<pubDate>Sun, 12 Aug 2018 15:29:44 +0200</pubDate>
<guid>https://tomis9.github.io/mesos/</guid>
<description>1. What is Mesos and why would you use it? It lets you set up and manage a cluster of machines &ldquo;easily&rdquo; (when you know how to use it, which is not very straightforward).
Combined with Marathon, it provides you with a nice interface to set up and manage docker containers and even build a whole system based on a microservice architecture.
2. Useful links A quick introduction to Apache Mesos</description>
</item>
<item>
<title>docker</title>
<link>https://tomis9.github.io/docker/</link>
<pubDate>Sun, 12 Aug 2018 15:29:16 +0200</pubDate>
<guid>https://tomis9.github.io/docker/</guid>
<description>1. What is docker and why would you use it? &ldquo;In simpler words, Docker is a tool that allows developers, sys-admins etc. to easily deploy their applications in a sandbox (called containers) to run on the host operating system i.e. Linux.&rdquo; In Python terms, it&rsquo;s basically a virtualenv, but for the whole OS. You can also think of it as some sort of a virtual machine;
it&rsquo;s a program which lets you encapsulate your software in its own minimal &ldquo;OS&rdquo; (known as a &ldquo;container&rdquo; in docker&rsquo;s world) and run it on any machine/server which has docker installed;</description>
</item>
<item>
<title>debugging</title>
<link>https://tomis9.github.io/debugging/</link>
<pubDate>Thu, 15 Feb 2018 15:17:57 +0100</pubDate>
<guid>https://tomis9.github.io/debugging/</guid>
<description>1. What is debugging and why would you use it? According to Wikipedia, “Debugging is the process of finding and resolving defects or problems within a computer program that prevent correct operation of computer software or a system”.
R provides a couple of useful functions, which may be used not only for debugging purposes, but also for testing and even developing code.
All the functions presented below are already available in the base packages (no installation required).</description>
</item>
<item>
<title>RMariaDB (former RMySQL)</title>
<link>https://tomis9.github.io/rmariadb/</link>
<pubDate>Mon, 12 Feb 2018 12:17:56 +0100</pubDate>
<guid>https://tomis9.github.io/rmariadb/</guid>
<description>1. What is RMariaDB and why would you use it? RMariaDB is an R package available on CRAN that lets you connect to MariaDB and MySQL databases easily. Why did I switch from RMySQL to RMariaDB? Because of an invalid long integer format when downloading data from a database. A pretty nasty bug in RMySQL, which was resolved in RMariaDB.
2. Usage First you have to create a connection to the database.</description>
</item>
<item>
<title>rTags</title>
<link>https://tomis9.github.io/rtags/</link>
<pubDate>Sun, 11 Feb 2018 13:47:00 +0200</pubDate>
<guid>https://tomis9.github.io/rtags/</guid>
<description>1. What are rTags and why would you use them? rTags let you jump directly to the definition of a function under your cursor. Modern IDEs provide this functionality, so why wouldn&rsquo;t you have it in vim?
2. How to use them? You don&rsquo;t necessarily have to read the articles from #3 (unless you want to understand what you are doing). All you have to do is run :RBuildTags, Nvim-R will create a tags file in your current directory and vim will automatically read this file each time you open any .</description>
</item>
<item>
<title>C in R</title>
<link>https://tomis9.github.io/cinr/</link>
<pubDate>Tue, 06 Feb 2018 22:07:51 +0100</pubDate>
<guid>https://tomis9.github.io/cinr/</guid>
<description>1. Why would you extend R with C language? some parts of your program may run too slowly. One of the possible solutions is to rewrite them in C;
if you create a library and you want it to be extremely fast, you will probably end up writing most of your functions in C;
it&rsquo;s worth learning even the basic example, as most basic R functions are written in C.</description>
</item>
<item>
<title>packages</title>
<link>https://tomis9.github.io/packages/</link>
<pubDate>Sun, 04 Feb 2018 12:06:07 +0100</pubDate>
<guid>https://tomis9.github.io/packages/</guid>
<description>1. What are R packages and why would you use them? R packaging is a convenient way to store and share your R code.
It lets you incorporate testing using testthat&rsquo;s specially prepared tools (you can use testthat without creating a package, but it&rsquo;s slightly more complicated).
It lets you easily list dependencies with packrat. You can also achieve this without using a package.
You can easily version your code.</description>
</item>
<item>
<title>Rcpp</title>
<link>https://tomis9.github.io/rcpp/</link>
<pubDate>Sun, 04 Feb 2018 12:03:14 +0100</pubDate>
<guid>https://tomis9.github.io/rcpp/</guid>
<description>1. What is Rcpp and why would you use it? Rcpp is an R library which lets you embed C++ code inside your R program.
Useful when you have a bottleneck in your code which makes the execution last forever. In that case you can rewrite it in super-fast C++.
Or you can switch to Python. Or data.table. Or write it in C.
2. A “Hello World” example Example taken from Advanced R by Hadley Wickham (link).</description>
</item>
<item>
<title>testing</title>
<link>https://tomis9.github.io/testing/</link>
<pubDate>Sun, 04 Feb 2018 12:02:23 +0100</pubDate>
<guid>https://tomis9.github.io/testing/</guid>
<description>1. What is testing and why would you use it? testing or test-driven development (TDD) is a discipline that relies on writing a test for every piece of functionality before creating it;
at first the test will fail, as we have not provided the proper functionality yet. Our goal is to implement this functionality so that the test passes.
In reality you modify your tests as you create the functionality or even write the tests after you have finished writing it.</description>
</item>
<item>
<title>classes - S4</title>
<link>https://tomis9.github.io/classess4/</link>
<pubDate>Sun, 04 Feb 2018 12:00:04 +0100</pubDate>
<guid>https://tomis9.github.io/classess4/</guid>
<description>1. Why would you use OOP in R? Object oriented programming in R is unfortunately rather complicated compared to Python (which seems to be the only reasonable alternative for data science programming).
However, there are certain cases when OOP may prove helpful:
when you write your own package and you want users to work on an object that your function returns (print, summary, plot);
learning OOP in R is a good investment, because it lets you better understand how functions like print, plot etc.</description>
</item>
<item>
<title>nls</title>
<link>https://tomis9.github.io/nls/</link>
<pubDate>Sun, 28 Jan 2018 22:52:21 +0100</pubDate>
<guid>https://tomis9.github.io/nls/</guid>
<description>1. What is nls and why would you use it? nls, or nonlinear least squares, is a statistical method used to describe a process as a nonlinear function of deterministic variables and a random variable;
you may think of it as an older and stronger brother of linear regression - more robust, more powerful, smarter and more difficult to understand ;)
2. A “Hello World” example Let’s assume we want to model US population as a function of year.</description>
</item>
<item>
<title>tidyverse</title>
<link>https://tomis9.github.io/tidyverse/</link>
<pubDate>Mon, 11 Dec 2017 14:26:39 +0100</pubDate>
<guid>https://tomis9.github.io/tidyverse/</guid>
<description>1. What is tidyverse and why would you use it? tidyverse is a collection of R packages that make working on data a much nicer experience than using base R;
it consists of tidyr, dplyr, ggplot2, tibble and a few more.
To be honest, I prefer data.table to tidyverse, as it resembles base R data.frames, is faster, more concise and, IMHO, more SQL-ish. But it takes longer to master, and its code may be more difficult to understand, even your own code after some time.</description>
</item>
<item>
<title>shiny</title>
<link>https://tomis9.github.io/shiny/</link>
<pubDate>Fri, 24 Mar 2017 09:13:23 +0100</pubDate>
<guid>https://tomis9.github.io/shiny/</guid>
<description>1. What is shiny and why would you use it? shiny is an R package that lets you create dynamic web applications without any knowledge of HTML, CSS, JavaScript, PHP etc. Pure R. Sounds like a dream?
Advantages:
easy to learn the basics;
easy to set up.
Disadvantages:
scalability;
performance;
in order to make the application work the way you want it to, you have to involve JavaScript, HTML and CSS.</description>
</item>
<item>
<title>ggplot2</title>
<link>https://tomis9.github.io/ggplot2/</link>
<pubDate>Fri, 24 Mar 2017 09:03:49 +0100</pubDate>
<guid>https://tomis9.github.io/ggplot2/</guid>
<description>1. What is ggplot2 and why would you use it? ggplot2 is an R package which makes creating nice-looking plots easy;
the plots you create are highly customisable;
Once you learn ggplot2, you will not make any production plots using base R. However, due to its verbosity, for simple exploratory analysis I still use base functions: plot, lines, hist and boxplot.</description>
</item>
<item>
<title>lubridate</title>
<link>https://tomis9.github.io/lubridate/</link>
<pubDate>Fri, 03 Mar 2017 13:46:45 +0100</pubDate>
<guid>https://tomis9.github.io/lubridate/</guid>
<description>1. What is lubridate and why would you use it? it’s an R package that makes working with dates easy;
because in base, no-frills R, working with dates can be a little daunting
2. A few “Hello World” examples
Load the package:
library(lubridate)
Convert a string to class Date:
# the base way
d &lt;- as.Date(&quot;2017-03-03&quot;)
class(d)
## [1] &quot;Date&quot;
# the lubridate way
d &lt;- ymd(&quot;2017-03-03&quot;)
class(d)
## [1] &quot;Date&quot;
Extract year, month and day</description>
</item>
<item>
<title>data.table</title>
<link>https://tomis9.github.io/data.table/</link>
<pubDate>Thu, 02 Mar 2017 15:45:27 +0100</pubDate>
<guid>https://tomis9.github.io/data.table/</guid>
<description>1. What is data.table and why would you use it? data.table is an R package which lets you work on tabular datasets quickly and easily;
compared to base R or dplyr, it is significantly faster;
data.table has a concise and SQL-like syntax.
2. Basic functionalities
Creating a data.table
library(data.table)
df &lt;- data.frame(x = c(&quot;b&quot;,&quot;b&quot;,&quot;b&quot;,&quot;a&quot;,&quot;a&quot;), v = rnorm(5))
dt &lt;- data.table(x = c(&quot;b&quot;,&quot;b&quot;,&quot;b&quot;,&quot;a&quot;,&quot;a&quot;), v = rnorm(5))
is exactly the same as creating a data.</description>
</item>
<item>
<title>reshape2</title>
<link>https://tomis9.github.io/reshape2/</link>
<pubDate>Wed, 01 Mar 2017 13:07:39 +0100</pubDate>
<guid>https://tomis9.github.io/reshape2/</guid>
<description>1. What is reshape2 and why would you use it? reshape2 is an R package that lets you change the shape of any data frame, i.e. to pivot and to “unpivot”.
Keep in mind that if your favourite R package for data frame manipulation is data.table, the functions dcast and melt are already included in that package and work exactly the same as those in reshape2.
2. A “Hello World” example In fact there are only two functions worth mentioning: dcast, which is equivalent to an MS Excel pivot table, and melt, which does the opposite, i.e. unpivots a table.</description>
</item>
<item>
<title>sqldf</title>
<link>https://tomis9.github.io/sqldf/</link>
<pubDate>Wed, 01 Mar 2017 13:07:39 +0100</pubDate>
<guid>https://tomis9.github.io/sqldf/</guid>
<description>1. What is sqldf and why would you use it? The sqldf package lets you treat any data.frame object as an SQL table. You can write queries as if you were in a database. Pretty useless, compared to, say, data.table or dplyr + tidyverse.
Despite its uselessness, it works like a charm.
2. A few basic examples:
Load the package:
library(sqldf)
## Warning: no DISPLAY variable so Tk is not available
Selecting specific columns:</description>
</item>
</channel>
</rss>