Why

For some reason, certain podcast are not accessible in China. Thus for good audio RSS content that I am afraid of missing out, downloading them and keep it offline is a nice idea.

However, most RSS clients are not tailored for my use case, so instead of using any tools, I did some research and finally was able to download all MP3 from this podcast:

  1. (out of date) https://raw.githubusercontent.com/Reyshawn/FanpieFilmFeed/master/fanPieFilm.rss
  2. (up to date)https://anchor.fm/fanpaifilm
  3. (马后炮节目) https://rss.shawnxli.com/fanpie-film-ma
  4. Latest: https://www.listennotes.com/zh-hans/podcasts/%E5%8F%8D%E6%B4%BE%E5%BD%B1%E8%AF%84%E8%80%B3%E6%97%81%E9%A3%8E-%E6%B3%A2%E7%B1%B3%E5%92%8C%E4%BB%96%E7%9A%84%E6%9C%8B%E5%8F%8B%E4%BB%AC-pRWES8tY-6H/

首次尝试

I did some research on podcatcher softwares, and no luck finding a good one.

Failed.

第二次尝试

Got some help from:

https://www.commandlinefu.com/commands/view/14685/download-files-linked-in-a-rss-feed

And I used this command to exact all MP3 links:

export HTTP_PROXY=http://127.0.0.1:58591; export HTTPS_PROXY=http://127.0.0.1:58591; export ALL_PROXY=socks5://127.0.0.1:51837

curl -s https://raw.githubusercontent.com/Reyshawn/FanpieFilmFeed/master/fanPieFilm.rss | grep -o '<enclosure url="[^"]*' | grep -o '[^"]*$'

Here are the results: Kaolafm content

Simply dump those links into IDM softwares and the rest are set.

问题

I got a bunch of files with naming like these:

1
2
82013211-c782-4a84-ad20-9001d1bdcd2e.mp3
4c6a181e-f8ff-4b9e-9c20-4a05e8edac84.mp3

No timestamp, no title, no good.

第三次尝试

I found this blog super useful:

https://rakhesh.com/coding/downloading-all-episodes-of-a-podcast/

Thus I wrote the following rss_script.sh:

for i in $(curl -s https://raw.githubusercontent.com/Reyshawn/FanpieFilmFeed/master/fanPieFilm.rss | grep -o '<enclosure url="[^"]*' | grep -o '[^"]*$'); do
    url=$i
    audiodir=$(echo $i | sed 's|http://image\.kaolafm\.net/mz/audios/||' | sed 's/\/.*\.mp3//')
    outfile=$(echo $i | sed 's|http://image\.kaolafm\.net/mz/audios/||')
    mkdir $audiodir
    wget -q $url -O $outfile
done

Simply:

bash rss_script.sh

and you shall have a nice time. Enjoy.

结果

All MP3s are in sub-folders like ‘201604’ or ‘202112’, however, I wasn’t able to get the right title this time.

第四次尝试

最终我发现,还是通过多项选择,表格,curl 的方法来弄比较好。

至于如何多行搜索选择,可以看这里

这里拿 https://anchor.fm/s/4a4df770/podcast/rss 这个link 的下载过程,简单记录一下:

  1. Extract title and mp3 links: title, mp3
  2. url 中有一些特殊字符需要处理:
    1. %2F 改为 /
    2. %3A 改为 :
    3. https:.*https 改为 https
  3. Put them together in a csv file (maybe use Excel): the csv file
  4. Make a curl bash and another example
  5. Execute the curl
  6. Done

注意:在mac上,不要用 iTerm,而是用 Terminal,否则 copy special character 会出错。

Reference

sed tutorial: https://www.digitalocean.com/community/tutorials/the-basics-of-using-the-sed-stream-editor-to-manipulate-text-in-linux

Old content: here

New content: here