/tech/ - Tech

Technology.

Let's Make an Imageboard - /hobby/ edition Anonymous Comrade 10/31/2019 (Thu) 22:28:28 No. 2315
With the official locking of >>>/tech/ I'm moving my development log thread here. Feel free to ask questions, make comments, or pass on your suggestions. Much of the template is now complete, the CSS is responsive, usable on every browser that has rem support, and only 3815 bytes. There is still some work to be done which I'll mention in a reply following this. As it stands the plan is to start programming the back-end in OCaml this Monday leaving the finishing touches in the front-end for a later date.

front-end principles
The site should be minimalist. Adding elements unnecessarily makes everything slower and more difficult to implement.

The site should be responsive. I don't own a device with a touchscreen, but they are omnipresent, and there is no reason that you shouldn't be able to use a website with your touchscreen device well.

The site should be compatible. I'd like to see support for more or less every desktop platform since ~2000 (this is easily done with MyPal, PaleMoon, and TenFourFox) and more or less every "smart device" ever.

specifics
Text formatting should be subtle and compose as expected. There should be nothing that would feel out of place in a book or article.

Web-fonts should not be used. Web-fonts don't improve the usability of the site but increase bandwidth usage and hinder performance for users.

JavaScript should only be used for progressive enhancements, and as much of the site as possible should work without it. The complete list of features which will require JavaScript is: automatic copying of text into the reply form, automatic refreshing of the page along with title notifications, and moving the second reply form.
>>2315
One of the current concerns for the front-end is how to deal with spoilers and emojis. Coloring emojis, even making them transparent, is only implemented in the most recent versions of Safari and Chrome. Apparently a sound way to deal with this is to swap all the emojis for SVG images, and maybe limit the number you can post per reply. Twemoji is one source of these SVGs. So the options are removal or SVGs, neither of which seems unreasonable to me.

Another question is what should be done in the case of nested quotes ">". In the other thread a comrade said that they would like to see these rendered as quotes but you do actually introduce some ambiguity with this because you're effectively overloading ">>" and ">>>". That means that if you're quoting someone two replies back who starts their sentence with a number or three replies back and starts their sentence with a board name it's not clear how that should be rendered. It seems that quite often you would want it to be the way it is not regardless of which way you set it. I'll have to think more about the cost benefit.

As I mentioned I'm thinking about focusing on the back-end when I return to work this Monday. Here's a list of the things that still need to be done on the front-end for when I return to this though:
When flex-wrap isn't available activate overflow-y:auto; for navigation.
Add reporting/deleting posts.
Check if whited00r has a browser with rem support.
Add moderation tooling.
Consider "desktop first" CSS because all mobile devices have media query support.
Set up a second position: fixed; form for easier posting.
Write the JavaScript features mentioned in the OP.
>>2316
Another interesting question is how should nested spoilers be handled? It seems clear that ideally they would simply apply a spoiler multiple times so that when hovering over the un-nested portion of the spoiler it would show only the un-nested portion of the spoiler and when hovering over the nested portion(s) it would show everything in scope.

The bad thing about this is that it requires opening and closing tags to be distinguishable. If this was not the case you'd introduce some ambiguity. The character count can still be left the same per tag though, and there is nice existing syntax for this sort of thing such as "/*This could be a spoiler*/" as used in a number of programming languages for comments.

Attached is the dark theme, each theme is only 10 lines of custom properties making it very easy to make your own through user-scripts or for more to be added to the site.
>>2317
Something which I'd be interested to get feedback on is the idea of post validation. The idea being that if your post has an error in it which could be caught at parse time, such as a reference to a post that doesn't exist, or an opening tag without a closing tag, or an invalid URL, this would signal an error to the user so that they could fix it. This wouldn't require any JavaScript, and the main downside I see is the effect on certain fun-posting activities such as when making a short reply to a fast moving thread where a formatting error doesn't really matter at all, or on things like trying to make GETs where failed attempts would be removed.
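A minimal sketch of what such parse-time validation could look like, written in Python rather than the OCaml planned for the back-end. The function name `validate_post`, the error strings, and the regular expressions are all made up for illustration, and the checks are deliberately rough (for instance a "/*" inside a URL would be miscounted as a spoiler tag).

```python
import re
from urllib.parse import urlparse

# Hypothetical patterns: plain post references, spoiler tags, and URL-like tokens.
REFERENCE = re.compile(r">>(\d+)")
SPOILER_TAG = re.compile(r"/\*|\*/")
URL = re.compile(r"\b[a-z][a-z0-9+.-]*://\S+")


def validate_post(text, existing_ids):
    """Return a list of problems that can be detected before the post is stored."""
    errors = []

    # 1. References to posts that do not exist.
    for match in REFERENCE.finditer(text):
        if int(match.group(1)) not in existing_ids:
            errors.append(f"reference to missing post >>{match.group(1)}")

    # 2. Spoiler tags: every "/*" needs a later "*/" and vice versa.
    depth = 0
    for tag in SPOILER_TAG.findall(text):
        if tag == "/*":
            depth += 1
        elif depth == 0:
            errors.append("closing spoiler tag without an opening tag")
        else:
            depth -= 1
    if depth > 0:
        errors.append("opening spoiler tag without a closing tag")

    # 3. URLs must at least name a host.
    for match in URL.finditer(text):
        if not urlparse(match.group(0)).netloc:
            errors.append(f"invalid URL {match.group(0)}")

    return errors
```

An empty list means the post can be accepted; anything else would be shown back to the user on the reply page, which needs no JavaScript.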
>>2318
seems like itd be more useful than annoying. my dream feature would be having functioning footnotes[1]

[1]no idea if its viable or not
>>2319
>Seems like it'd be more useful than annoying.
You might be correct. Keep in mind it would only ever fire when you actually messed up.

>My dream feature would be having functioning footnotes.
That's completely possible, but in line with the minimalism principle of the front-end I'd prefer to not have it. I just removed strike-through actually, in the hope of reducing the amount of text formatting.
>>2318
The last thing that comes to mind in terms of features for the parser is potentially to implement "quote optimization" which some textboards have if I understand correctly. This would be something like allowing a notation such as ">>74-80" to refer to the seven posts bound by the range inclusive, and ">>78,90" which would just refer to those two posts. The way I'm aware of to display such a thing without JavaScript is to effectively generate a new page every time one of these combinations is created, and have the link go to this page. Even with JavaScript it seems it would be a bit awkward to have the posts show up while hovering over the link. Can you think of any better way to display these things? I'd like to have this feature if a reasonable approach can be found.

Actually a possibility would be to have ">>78,90" operate as a macro for ">>78\n>>90" and a similar expansion for ">>74-80". This would be a decent enhancement for users without JavaScript so that they wouldn't have to type so many ">>"s; otherwise you'd just use the automatic placement with JavaScript, I imagine. Thoughts?
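The macro idea can be sketched roughly as follows, in Python for brevity rather than OCaml. `expand_numbers` and `expand_quotes` are hypothetical names, and the sketch assumes references are plain post numbers with no board or thread scope.

```python
import re

# ">>" followed by comma-separated numbers or inclusive ranges, e.g. ">>74-80,90".
QUOTE_MACRO = re.compile(r">>(\d+(?:-\d+)?(?:,\d+(?:-\d+)?)*)")


def expand_numbers(spec):
    """Turn "74-76,90" into [74, 75, 76, 90]; a reversed range yields nothing."""
    numbers = []
    for part in spec.split(","):
        first, _, last = part.partition("-")
        numbers.extend(range(int(first), int(last or first) + 1))
    return numbers


def expand_quotes(text):
    """Rewrite every macro in the text as one ">>N" reference per line."""
    def replace(match):
        return "\n".join(f">>{n}" for n in expand_numbers(match.group(1)))
    return QUOTE_MACRO.sub(replace, text)
```

A real implementation would presumably cap the size of a range, since a single short reference could otherwise expand into thousands of lines.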
>>2321
Can you stop bumping your shitty thread about your gay imageboard no one cares about?
>>2320
I'm interested in feedback, and there are far worse, more active threads. If you don't like this thread press the hide button on the OP or if you don't like >>>/hobby/ consider not using the overboard etc. (sage because off topic)
>>2321
>effectively generate a new page
Isn't it possible to have clickable things that collapse parts of a page just with CSS?
>>2324
>Isn't it possible to have clickable things that collapse parts of a page just with CSS?
Yah, you can do that and I am in several parts of the site. It's a possibility, the issue in this case is that to do this you would need to copy the HTML of every post to everywhere it is quoted and this would quickly increase the size of pages and is a little ugly. Maybe it would be worth it though, I unfairly ruled it out.
>>2325
Searching the idpol thread as an example of a full thread reveals it has only slightly more quotes than posts, so if that's a fair example and the HTML of that page is 533944 bytes with ~80KB used for something other than the posts, we can guess the size after applying this would be ~900KB. Uncompressed this is equivalent to the thumbnails in the thread. Gzip would help a good bit with this though: the original HTML for the posts gzipped is ~75KB and when copy pasting it into itself is still only ~150KB (as expected). I might consult a sympathetic guru for their thoughts on this. A downside independent of size is that the site would be effectively unusable in browsers that don't use CSS such as W3M, Links, EWW, etc.
>>2324
>Isn't it possible to have clickable things that collapse parts of a page just with CSS?
I was thinking about adding content not subtracting it in my last post sorry about that. The trick with this is that you would need to dynamically generate CSS to hide things based on ID.

Perhaps this might also have been what you were thinking but what we could do is expand ">>74-80" into ">>74,75,76,77,78,79,80" on :hover while having only the individual numbers as links. I actually really like this idea and I think I'll implement it.

Something text-boards do to make this notation more useful is to start the replies to each thread with 0 and increment only in the thread, while threads are assigned a number incrementing from the start of the board. So to reference a range in another thread you'd type something like ">>/213/30-40" where 213 is the thread on the current board and 30-40 are the posts you'd like to reference. I think this can be extended while reducing the number of text formatting elements to include boards. For example to reference posts 1 through 5 and 10 on the 12th thread on /q/ would be written as ">>/q/12/1-5,10" and would be expanded to ">>/q/12/1,/q/12/2 ..." on :hover following the scheme laid out above. You can do likewise with boards ">>/q/" and threads ">>/8-12,3/" so there is no longer a need for ">>>".
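As a rough illustration of the scoped notation, here is a Python sketch (not the OCaml parser) that handles only the two post-addressing forms, ">>/thread/posts" and ">>/board/thread/posts". The name `expand_scoped` and the assumption that board names are lowercase letters are mine, and the board-only and thread-only forms are left out.

```python
import re

# Optional board, then a thread number, then a list of posts and inclusive ranges.
SCOPED = re.compile(
    r">>/(?:(?P<board>[a-z]+)/)?(?P<thread>\d+)/"
    r"(?P<posts>\d+(?:-\d+)?(?:,\d+(?:-\d+)?)*)"
)


def expand_scoped(reference):
    """Expand ">>/q/12/1-2,10" into ">>/q/12/1,/q/12/2,/q/12/10"."""
    match = SCOPED.fullmatch(reference)
    if match is None:
        raise ValueError(f"not a scoped post reference: {reference!r}")

    # The scope is repeated in front of every expanded post number.
    if match["board"]:
        scope = f"/{match['board']}/{match['thread']}/"
    else:
        scope = f"/{match['thread']}/"

    posts = []
    for part in match["posts"].split(","):
        first, _, last = part.partition("-")
        posts.extend(range(int(first), int(last or first) + 1))

    return ">>" + ",".join(f"{scope}{n}" for n in posts)
```

Each comma-separated item in the result would then become its own link, with the short form shown by default and the expanded form shown on :hover.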
I did some work last night on the CSS because I was a little too sleepy to work on the back-end. I've now added the delete/report functionality to the front-end along with adding multiple files and a catalog with the nice feature where you have to click on the threads to scroll them like on vichan derivatives. All these things work without JavaScript, which is atypical. I also removed flexbox entirely, using a scrolling off-canvas menu instead; I think this was in general a positive thing. Along with some optimizations done in some of my time off over the week, despite the increased functionality we're now at 3434 bytes. It seems reasonable to attempt to stay within 4kB for the site's CSS.

Currently I'm trying to make sure that I have the back-end stack that I actually want. The OCaml ecosystem is so fragmented that it's often difficult to select things.
>>2328
You're doing God's work anon. I hope to use it soon.
>>2329
>You're doing God's work anon. I hope to use it soon.
Thanks man, that means a lot. I did underestimate the time it would take. When I started I said 3 weeks of very part time work but it looks like it's going to be close to double if not triple that.

Anyway I got my environment cleaned up and selected the stack today, but it took far more time than I would have liked. I decided against using a web-framework, the applications I plan on using are: Haproxy, Memcached, and MariaDB, and the libraries I plan on using are: http/af, Tyxml, Caqti, Lwt, CamlImages, and Angstrom. Beyond this all I was able to get done on the project today was to start some of the HTML generation in Tyxml. I plan on making pretty aggressive use of partial evaluation to make the HTML generation faster, which is a bit fun.
>>2330
can you add user created boards as a feature?
>>2331
>Can you add user created boards as a feature?
I might be able to add this in the future for other hosts. On the imageboard I'll be hosting this isn't something I'm interested in. I'm going to take a quality over quantity approach.
>>2332
so will this be a proprietary chan or as open source software?
>>2330
>MariaDB
why not postgres?
>>2333
>so will this be a proprietary chan or as open source software?
Sorry, I went to sleep, it will be AGPLv3 hopefully with some easy means of anonymously contributing to the project. The source will be hosted on the same host as the imageboard, so I will not be releasing the source until it's in beta.

>>2334
>why not postgres?
I'm not a database expert, but from what I understand MariaDB is faster without compromising on security, and because I'm going to be using Caqti other hosts can use PostgreSQL instead without any change to the program.
>>2335
I'm glad someone is doing this, it would be great to get bunkerchan off lynxchan as well, IMO lynxchan blows.
>>2335
do you already have a domain for this? is it going to be another leftist chan?
>>2336
>I'm glad someone is doing this, it would be great to get bunkerchan off lynxchan as well, IMO lynxchan blows.
Honestly, the gripes most people have with it are just with the front-end, which can easily be changed, and Stefan said that they had "30x performance" of PHP based boards. I've got no idea if Stefan's claims are true; https://endchan.org had persistent issues with their database melting down for the longest time and Appleman over at https://lainchan.org has claimed a migration would be a regression in both performance and features. I'm mostly interested in writing my own so that I can have enough motivation to moderate the site properly once it's running if that makes any sense, but also because I think I might be able to write something really exceptional. I would be very happy if bunkerchan or other imageboards migrated over but it's going to take some time before it has proven itself worthy of this.

>>2337
>Do you already have a domain for this?
No domain, host, or name yet. Just the user-facing side of the front-end and a little bit of the back-end. I'll be taking advice on all three of these things here once we get to that stage.

>Is it going to be another leftist chan?
It'll have a leftist theme, although at least in the beginning I don't want to host a politics board. They are notoriously difficult to moderate, and there is already one here anyway. I think to start it's going to be just /g/ and after that I'll expand more or less following Futatsu's line: "an imageboard intended to promote and discuss all different kinds of self-made digital content! Come and discuss best practices for programming, drawing, and more, get feedback on your work-in-progress, or show off your latest creation."
I polished off the static portion of the HTML generation today for all the pages, and started on the parser combinators to generate the dynamic HTML from the messages. Not that much to show off so here's an elegant function to generate the navigation bar. Anyway it's looking like the rest of the week is going to be quite busy for me, so while improvements on the parser are the type of thing I feel I'd normally be able to make some progress on throughout the week I have some doubt.
(1.69 KB 145x125 ideaguy.png)
idea guy returns:
how about options for placing uploaded images within the text? I'm imagining something like a thread where every reply looks like a wordpress blog. The only exception being the OP, as it's likely essential to being a chan that the OP has a picture in the top left, but why not let replies have full control over where the picture they upload goes?
>>2340
there are only three positions for a wordpress picture, left aligned, right aligned, and center, with the text wrapping around it in various forms. im not sure what this adds, especially since the whole point of responsive was to make it work on small screens (mobile) where the image alignment wont even matter because any image is going to take the full width of the screen regardless of alignment
>>2340
plus, if he does that you might as well say well let people do any kind of styling they want, hell have them write in markdown and make him write a MD parser while youre at it
>>2340
>How about options for placing uploaded images within the text?
I'm not super interested in adding features to the text formatting unless they make things more orthogonal, ergonomic, or intuitive. The only thing I've been tempted to add other than these things is some sort of server-side latex formatting (I would use a preexisting solution) for mathematical equations because you can't really write equations properly otherwise. In case you're curious, what follows is the markup language I intend to use, although I won't re-describe >>2327:
bold
italics
/*spoilers*/
```pre-formatted```
>quotes
scheme://valid-url.tld
>>2343
There is some overlap with the formatting here, my bad. Italics will be wrapped with "~~" and bold will be wrapped with "__".
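To make the proposed markup concrete, here is a small Python sketch of a renderer for one line of it (the real parser is planned in OCaml with parser combinators, so this is only an approximation). The output tags (`<b>`, `<i>`, `<mark>`, `<code>`, `<blockquote>`) and the names `render_inline` and `render_line` are assumptions for illustration; URLs, references, and nested spoilers are not handled.

```python
import html
import re

# Pre-formatted spans are split out first so nothing inside them is formatted.
PRE = re.compile(r"```(.+?)```")

# Inline markers applied to already-escaped text (escaping leaves them untouched).
INLINE = [
    (re.compile(r"__(.+?)__"), r"<b>\1</b>"),
    (re.compile(r"~~(.+?)~~"), r"<i>\1</i>"),
    (re.compile(r"/\*(.+?)\*/"), r"<mark>\1</mark>"),
]


def render_inline(text):
    """Escape the text and apply bold, italics, spoilers, and pre-formatted spans."""
    out = []
    # re.split with one capturing group puts the captured spans at odd indices.
    for index, part in enumerate(PRE.split(text)):
        escaped = html.escape(part)
        if index % 2 == 1:
            out.append(f"<code>{escaped}</code>")
        else:
            for pattern, replacement in INLINE:
                escaped = pattern.sub(replacement, escaped)
            out.append(escaped)
    return "".join(out)


def render_line(line):
    """Render one line; a single leading ">" marks the whole line as a quote."""
    body = render_inline(line)
    if line.startswith(">") and not line.startswith(">>"):
        return f"<blockquote>{body}</blockquote>"
    return body
```

Escaping happens before any tags are generated, so user input can never introduce markup of its own.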
>>2343
>>2344
Feedback would be great on these things by the way. Would the more traditional notation for bold "**" be preferable, and what do you think of "~~" for italics? I know some sites such as this one use this for de-emphasis of some sort. Is there any worry that "```" will show up in source code anywhere?
>>2316
>>2345
If you do nested quotes that reduce contrast with each deeper level, you automatically also have a spoiler function, hmmm.
>Would the more traditional notation for bold "**" be preferable, and what do you think of "~~" for italics.
Slightly better than how it's here IMHO.

The most direct way to tell people how formatting works would be a formatting code that does not vanish in the published post, e.g. putting everything in quotation marks turning into italics with the quotation marks staying, an exclamation mark and the word right before it turning bold, stuff like that.
>>2346
>If you do nested quotes that reduce contrast with each deeper level, you automatically also have a spoiler function.
Typically quotes are not delimited so you can nest, but only if your intention is to spoiler everything until the end of line. I haven't ruled out the idea of nested quotes yet.

The ambiguity can be gotten around quite easily by changing either the syntax for linking (for example to @) or quoting. Then my only concern would be that you'd have to have a separate class for each level and each would have to have their own span. That takes up a bit of space but nothing to worry about too much. Really I guess the only question is if this is desirable? I know you're a little passionate about it but I haven't decided myself.

>Slightly better than how it's here IMHO.
That's good to hear! Does this mean you prefer the traditional "**" to "__" for bold though? I'll likely go ahead and pick whichever you prefer so long as there isn't opposition from others.

>The most direct way to tell people how formatting works would be a formatting code that does not vanish in the published post.
WYSIWYG is generally quite nice, I agree. To some extent it's a bit tricky to do nicely in a markup language once the number of elements gets somewhat large though. For example your syntax choice for bold would rule out having a post with multiple exclamation points in it because everything in between would be accidentally made bold. Your syntax for italics would provide nearly all the same meaning except for when you want subtle emphasis without looking sarcastic.
>>2347
Actually the way I plan on making things work is that having multiple ">" would still be made a quote but precedence is always given to links if what follows ">>" is a valid link. Semantically that means quotes don't nest, but it does sort of give that effect without guarantees. In fact I might need to modify the syntax just to avoid ambiguity here anyway...
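The precedence rule can be sketched like this in Python (again only a stand-in for the OCaml parser). `classify` is a hypothetical helper, and "valid link" is assumed here to mean "refers to a post number that exists".

```python
import re

# A line that opens with ">>" and a number is a candidate link.
LINK = re.compile(r">>(\d+)")


def classify(line, existing_ids):
    """Give links precedence; any other line starting with ">" is a quote."""
    match = LINK.match(line)
    if match and int(match.group(1)) in existing_ids:
        return ("link", int(match.group(1)))
    if line.startswith(">"):
        return ("quote", line)
    return ("text", line)
```

With this rule ">>2315" becomes a link only when post 2315 exists, while a quoted line that happens to start with ">" and a number falls back to being an ordinary quote.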
>>2348
This is actually sort of odd: it's only ambiguous if you interpret multiple ">" that aren't a link as nested quotes, otherwise there isn't any ambiguity. Unfortunately because of the way it's displayed it can be interpreted as that, and there isn't really an unambiguous alternative way to display it.
How often does it happen on imageboards that one actually uses the email field for something else than sage? I have seen it about a dozen times and two times out of three it was by people mistakenly believing it was necessary for posting and not published for the world to see (after all, they didn't see published contact information for the other posts). The sage function could be just a checkbox "bump thread" selected by default.
>>2350
>How often does it happen on imageboards that one actually uses the email field for something else than sage? The sage function could be just a checkbox "bump thread" selected by default.
I agree with this, although my implementation is slightly different. As you can see in the first two screenshots in the thread there isn't an email field, but there isn't a sage button either. The idea is that you would put sage in the subject textbox which would be equivalent. There will likely be some indicator of a post being saged.
>>2350
almost never. its just there by tradition.
random newfag question, what is the checkbox on each post for?
>>2353
>random newfag question, what is the checkbox on each post for?
It's for reporting/deleting without JavaScript. You can see my implementation here: >>2328, although this will likely have to be replaced with the more traditional way of putting delete/report in the footer due to how inefficient mine is.
>>2354
the right arrow also creates a dropdown for report, global report, and delete post?

Is this some sort of legacy reason, because you can now make dropdown menus with pure CSS no js involved, so that kind of seems dated unless you want to have a captcha or something required for reports
>>2355
>Is this some sort of legacy reason, because you can now make dropdown menus with pure CSS no js involved, so that kind of seems dated unless you want to have a captcha or something required for reports
You can make dropdown menus in pure HTML/CSS (I think you always could) but the issue is you have to duplicate the form for reporting into every reply. I did a test and it came out to roughly an extra 150kB uncompressed for a full thread (500 replies). This doesn't negatively affect HTML only browsers like W3M though, so it might be worth it anyway.
>>2356
couldn't you just make it a simple link to a global reporting page? put the post id to be reported in the url (get) and have it pasted on the report page that it gets redirected to. Theres no rule that says you have to do everything from 1 single page, especially if someone is going to a global report (probably something serious like cheese pizza). if youre worried about them losing the page, do target = _blank for a separate tab . that way you don't have to load tons of forms in either case, just 1 on its own dedicated global report page
>>2357
>couldn't you just make it a simple link to a global reporting page? put the post id to be reported in the url (get) and have it pasted on the report page that it gets redirected to. if youre worried about them losing the page, do target = _blank for a separate tab. that way you don't have to load tons of forms in either case, just 1 on its own dedicated global report page
That's an exceptional idea, cheers!
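A rough Python sketch of that idea: each post carries a plain link with its id in the query string, and the dedicated report page reads the id back out. The "/report" path, the parameter names, and the function names are all hypothetical.

```python
import html
from urllib.parse import parse_qs, urlencode, urlsplit


def report_link(board, post_id):
    """Build the GET URL that identifies the post being reported."""
    return "/report?" + urlencode({"board": board, "post": post_id})


def report_anchor(board, post_id):
    """Render the per-post link; target="_blank" keeps the thread open in its own tab."""
    href = html.escape(report_link(board, post_id))
    return f'<a href="{href}" target="_blank">Report</a>'


def parse_report_request(url):
    """On the report page, recover (board, post_id) or reject the request."""
    query = parse_qs(urlsplit(url).query)
    boards = query.get("board", [])
    posts = query.get("post", [])
    if len(boards) != 1 or len(posts) != 1 or not posts[0].isdigit():
        raise ValueError("malformed report request")
    return boards[0], int(posts[0])
```

Only one small anchor is added per reply, and the single report form lives on its own page instead of being duplicated into every post.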
>>2358
hmmm, it does make it more difficult to do any sort of bulk reporting though. Also do you think the indirection would discourage reporting?
>>2359
I didn't think of bulk reporting. I suppose in real circumstances for the most part it's either 1 offending post or a raid? if your database schema is giving a unique ID per THREAD, then you can have multiple options, such as "report post", "report thread", etc. So if user clicks report post that post id gets sent, but for thread, thread id gets sent.

It's more blunt than being able to mass report 'particular' posts on a thread, but presumably if an admin gets the global report they will be looking at the whole thread anyway?
>>2359
i suppose indirection could discourage people from reporting in theory
>>2360
>Presumably if an admin gets the global report they will be looking at the whole thread anyway?
I'm fine with even this being enough to resolve that issue.

>>2361
>i suppose indirection could discourage people from reporting in theory
I wonder if it would be reasonable to just get rid of the report text and simply have a report button of some sort. Potentially asking with a pop-up or something similar if you're sure you want to report this post?
>>2362
>I wonder if it would be reasonable to just get rid of the report text and simply have a report button of some sort. Potentially asking with a pop-up or something similar if you're sure you want to report this post?
I guess this wasn't a productive comment, it just put us back where we started with one less element. Here's an alternative: there is no pop-up but the report can instantly be undone. For example have a report button just submit but on the server side if you receive a second submit for the same name undo the reporting. Deletions are handled in the way you mention.
>>2363
I've thought about this for a moment, and I'm actually going to simply implement your original proposal. I think the faults I was perceiving are not valid. In the future I'll be taking more time to assess things properly before commenting.
>>2364
Here's the implementation, this allowed me to quickly clean up some excess styling as well bringing us down to 3289 bytes in addition to removing the bulky form elements from the thread view. Also the HTML is now classless in true Marxist fashion.
>>2365
looks good
>>2365
btw are you going to dockerize this or write some sort of build script? also might consider replacing haproxy with nginx because on ubuntu hosts its way easier to set up letsencrypt with nginx
>>2366
>looks good
Thanks, honestly I think the reply header looks a little funny now, but I bet whatever system I use for IDs will make it work well. I was thinking about having a randomly selected name from a list (like on 2channel or lainchan) per-user per-thread.

>>2367
>btw are you going to dockerize this or write some sort of build script?
I haven't really thought about making things easy for other hosts yet. Currently I'm just using the OCaml standard build tool Dune. It should be well equipped to pull everything off here, but I doubt I'll want to foist a full OCaml environment on anyone who wants to host an imageboard.

>also might consider replacing haproxy with nginx because on ubuntu hosts its way easier to set up letsencrypt with nginx
It does seem worse (I'll also be using it for TLS termination, which is probably also worse), but still not that bad if you're using a script or even just examples. The reason I selected Haproxy over Nginx was because my understanding was that you had to give Nginx a couple grand a year to get solid monitoring, while Haproxy had it by default and performance was roughly equivalent. I suppose ideally I'd be able to have pre-built configurations for a number of load balancers like Nginx, Haproxy, and Relayd but that seems like it'd be subject to rot unless there were capable hosts actually using each of these.
>>2368
>Honestly I think the reply header looks a little funny now, but I bet whatever system I use for IDs will make it work well.
This is mostly because some of the optimizations distorted the padding around the reply header actually. I was procrastinating last night so we're down to 3020 bytes for the CSS now by the way. Not sure why I didn't procrastinate with the parser instead...
>>2369
I've got a question, what is the point of having two separate views of the imageboard, the classic paginated view and the catalog? which is better?
>>2370
>What is the point of having two separate views of the imageboard, the classic paginated view and the catalog? which is better?
They're useful for different things. On slower boards the paginated view is more useful because changes to the board are likely to be contained on the 1st page, on faster boards the catalog will be more useful because you avoid the indirection of having to click through the pages. There is also the issue of imageboard catalogs being extremely large (in contrast to textboard catalogs) so the page might be slow enough to download on poor connections that you might prefer the paged view. Lastly it lets new-fags interact directly with the content upon arrival instead of having to have the extra indirection, which might be a positive or a negative, likely once again depending on size.
Apparently I'm pretty awful at writing parsers. Attached is an image of the types and the start of the parser. I've got sound implementations of everything but links, paragraphs, plain text, and references. Paragraphs and plain text I have unsound implementations of.
>>2372
i dont know ocaml, is this the part of the program that translates chan formatting marks into html?
Hi noob here. Can someone explain to me in simple language how an imageboard functions? I know http servers read an html/css file and send it over the web through the http port. The web browser reads that html and renders it into a webpage that we look at. Can somebody explain to me like this?
>>2373
Yep, well actually it translates the imageboard's markup language into an AST and then into HTML. It's really simple, but I think the implicit loops, implicitly handled failures, and subtle syntax are throwing me off a bit.
>>2375
Here's an implementation of the reference system. If we only ever wanted to reference ranges or lists of boards, threads, and posts, all with their implicit scope, this would be enough. However, we often want to specify an explicit scope as described in >>2327, so a little more work has to be done for it to be complete.
>>2376
Attached is the complete implementation. This is pretty nice as I believe this is the first major novel feature to be included in the back-end. This can parse things as complex as >>/testing,g/,/pol/1-20/,20,30,40-45 and I believe it is free of errors and major inefficiencies.
>>2377
Okay, I've added a link parser and decided that the paragraph parser can't really be a parser but instead needs to be an optimization performed on the AST after parsing. This means that all that's left is optimizations and plain text parsing. This might sound a little silly but I don't really know how to handle plain text. Part of this is because I don't necessarily know how to handle an opening or closing tag missing its counterpart. Should these be deleted, left in plain text, or should they apply to the end of the input?

What do you think?
>>2377
>This can parse things as complex as >>/testing,g/,/pol/1-20/,20,30,40-45 and I believe it is free of errors and major inefficiencies.
Amusingly this example actually failed. Here's a version with a clean up and this bug resolved.
>>2377
>>2378
I don't know OCaml so i can't tell you tbh. Keep in mind though, the hardest parts of programming an imageboard are probably going to be the more stateful parts like database interaction, file uploads, etc.

If you need any help with the database schema or something i can do that.

Just a shot in the dark, maybe not an elegant solution, but maybe you could do some preprocessing of the input, possibly using a regex, to enclose all text not in tags into a "[plain][/plain]" tags? that way when it goes to your parser you have something to work with?

Also don't forget to sanitize the output for HTML, Javascript. If the query builder youre using doesn't have parameter escaping then you need to worry about sql injection as well
>>2380
>Keep in mind though, the hardest parts of programming an imageboard are probably going to be the more stateful parts like database interaction, file uploads, etc.
That sounds correct, I'm just working (mostly) front to back for wishful thinking / top down purposes.

>If you need any help with the database schema or something i can do that.
That'd be great! I'll certainly have you check things out when I get there then!

>Just a shot in the dark, maybe not an elegant solution, but maybe you could do some preprocessing of the input, possibly using a regex, to enclose all text not in tags into a "[plain][/plain]" tags? that way when it goes to your parser you have something to work with?
I think I can probably pull something a little cleaner off (my question probably wasn't even fair to ask without showing the full source of dispatch). I've been thinking a bit since I asked and I suspect what I have to do is rewrite my dispatch function because the way it works currently it's not possible to sanely parse plain text. Probably won't get to work on things much until next week now though sadly.

>Also don't forget to sanitize the output for HTML, Javascript. If the query builder youre using doesn't have parameter escaping then you need to worry about sql injection as well
I'm pretty lucky to have some exceptional abstractions to keep me from worrying about these things. TyXML, which I'm using for HTML generation, not only sanitizes but also type checks values (so width, for example, must be passed an int) and checks that the HTML is properly formed. Caqti similarly not only abstracts over theoretically any SQL database (although in reality there are only MySQL/MariaDB, PostgreSQL, and SQLite drivers) but also type checks your entries.
>>2381
>PostgreSQL
technically CockroachDB (a Google Spanner clone) is Postgres wire-protocol compatible, so you can use a Postgres driver with it. However it's a PITA to set up manually (unless you're using a managed DB, and Cockroach is only on Google Cloud and AWS).
>That'd be great! I'll certainly have you check things out when I get there then!
Awesome.
>>2382
>technically cockroachDB (google spanner clone) is postgres wire protocol equivalent - you can use postgres driver on it. However its a PITA to set up manually (unless you're using a managed DB, and cockroach is only on google cloud and AWS).
That's neat, I didn't know that, I'll have to try that some time. For this project though I think I'd prefer to use officially supported (open source) databases if at all possible, and I plan on tracking down a host which I can pay for with XMR or https://xmr.to which I believe rules out Bezos and Pichai.
Here's what I'm thinking in terms of programmatic content limitations for my host, and likely default for future hosts. Feel free to express any concerns or comments you may have.

All fields will be UTF-8 encoded, and all glyphs will be converted into their text variants. The following are the maximum values for various items: 128 posts per thread and 128 threads per board, 8 threads per page, 4 files per reply, 4 MB total upload size per reply, 4 KB for each message, and 64 bytes each for the password and subject fields. Perhaps a 256 byte minimum for thread messages and a 4 byte minimum for thread subjects.

Accepted file types: (anything with a text encoding), image/bmp, image/gif, image/jpeg, image/png, image/tiff, video/mp4, video/webm, video/ogg, audio/mpeg, audio/wav, audio/ogg, audio/mp4, audio/webm, audio/flac, audio/aac, audio/aacp, and application/pdf
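To make the limits above concrete, here is a small sketch of how they could be checked on a reply. It is Python for illustration only (not the OCaml the engine is being written in), the `Limits` and `check_reply` names are invented, and it assumes "4Mb" and "4Kb" mean 4 MiB and 4 KiB measured in UTF-8 bytes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Limits:
    # Values taken from the post above; the byte interpretations are assumptions.
    max_files: int = 4
    max_upload_bytes: int = 4 * 1024 * 1024
    max_message_bytes: int = 4 * 1024
    max_field_bytes: int = 64

def check_reply(message, subject, password, file_sizes, limits=Limits()):
    """Return a list of violations; an empty list means the reply is acceptable."""
    errors = []
    if len(message.encode("utf-8")) > limits.max_message_bytes:
        errors.append("message too long")
    if len(subject.encode("utf-8")) > limits.max_field_bytes:
        errors.append("subject too long")
    if len(password.encode("utf-8")) > limits.max_field_bytes:
        errors.append("password too long")
    if len(file_sizes) > limits.max_files:
        errors.append("too many files")
    if sum(file_sizes) > limits.max_upload_bytes:
        errors.append("upload too large")
    return errors
```

Measuring encoded bytes rather than characters matters here because a 64 character subject in a non-Latin script can be well over 64 bytes.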
>>2384
>All fields will be UTF-8 encoded, all glyphs will be converted into their text variants.
Just to be clear, many glyphs lack text variants; these glyphs will be automatically removed from the input. These non-text variants are generally annoying and don't have consistent meaning across devices anyway, so I feel it's no major loss.
>>2377
One last thing, I've actually decided that allowing lists of references at different scopes such as: >>/testing,g/,/pol/1-20/,20,30,40-45 has more loss in reading ergonomics and intuition than it gains in writing ergonomics and orthogonality. Instead I've changed it so that the way this would be written is >>/testing,g/ >>/pol/1-20/ >>20,30,40-45 which is far easier to understand and has no significant loss in either ergonomics or orthogonality.
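As an illustration of the post-number form only (>>20,30,40-45), here is a short Python sketch that expands such a reference into individual post numbers. The regex and the `expand_post_refs` name are my own guesses at the grammar, not the actual parser, and the board-scoped forms like >>/pol/1-20/ are deliberately left unmatched.

```python
import re

# Assumed grammar: ">>" followed by comma-separated numbers or N-M ranges.
REF = re.compile(r">>(\d+(?:-\d+)?(?:,\d+(?:-\d+)?)*)")

def expand_post_refs(text):
    """Return every post number referenced by >>N lists in text, in order."""
    posts = []
    for m in REF.finditer(text):
        for part in m.group(1).split(","):
            lo, _, hi = part.partition("-")
            start, end = int(lo), int(hi or lo)
            posts.extend(range(start, end + 1))
    return posts
```

A real implementation would presumably want to cap the size of a range, since something like >>1-999999999 would otherwise expand into a huge list.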
any updates?
>>2387
>any updates?
I tend to be quite busy and a tad exhausted every day but Monday and Tuesday. Anything I come up with on other days is just what I can manage to do with extremely limited time and energy, usually just things that don't require too much creativity like refactoring or simple design decisions. The only thing I've had time to do since my last post was start looking at VPS providers, specifically the following are under consideration, likely in order, but there are many I haven't considered at all yet, and many others that were promptly ruled out: https://flokinet.is/ https://openbsd.amsterdam/ https://cockbox.org/

I guess I can post the changes made to the reference code though since I only spoke of it last. Rather insignificant things. I've thought of a sound way to rewrite my dispatch function but have not implemented it yet.
>>2388
https://www.digitalocean.com/pricing/
https://www.digitalocean.com/docs/platform/availability-matrix/

American hosting company, cheap alternative to AWS, has data centers in several Asian and European locations in addition to North America. Euro locations: Amsterdam, Frankfurt, London

cheapest droplet is $5/€4.5 a month for 1TB transfer, 25GB disk space
>>2388
it also has a built-in load balancer for $10 a month and a managed DB (MySQL) for $15 if you want
>>2388
In case anyone would like to help me find a VPS provider what I'm looking for ideally is a host that: doesn't enforce DMCA, accepts BTC/XMR, has unmetered traffic, is tolerant of Tor, is reasonably affordable, and is based in a sane country (not US, Germany, etc.). It's unlikely that I'll be able to find all of these things from any one provider, and if I did they would likely have some other issue but that's ideal. Of these BTC or XMR, unmetered traffic, and affordability are the only hard requirements.
>>2389
>>2390
Oh man, you posted this as I was replying, sorry man. I've heard good things about DigitalOcean, but I don't think they manage to meet my requirements unfortunately. I'll give them another look though.
>>2391
>doesn't enforce DMCA, accepts BTC/XMR, has unmetered traffic, is tolerant of Tor,
it's going to be difficult to find a single country in the world that tolerates all that.
>>2393
>it's going to be difficult to find a single country in the world that tolerates all that.
The difficulty seems to be that there is little overlap between unmetered VPS providers and the rest of my requests. For example https://www.orangewebsite.com accepts BTC, doesn't enforce DMCA, is supportive of Tor, and has an untarnished reputation, but is rather expensive and has rather small traffic limits. https://flokinet.is/ does similarly, although with a slightly more tarnished reputation, affordable prices, and larger traffic caps. It might be that I have to settle for metered traffic, which isn't the end of the world.
>>2394
once you finish it, is there a chance you can upload the source code to GitHub one time so others can see/use it? it's a lot more accessible than hosting it on a custom site
(2.38 KB 318x118 slider-captcha.png)
>>2395
BTW i know you haven't thought of it yet but with regards to spam filters:

CAPTCHAs are outdated. ML/CV has gotten good enough that computers can solve them better than humans can. They're also annoying.

The upside is that bots typically don't view the website visually. They just look at the forms in the HTML and submit those, so you have two options:

1. Make a checkbox that's generated at runtime by JavaScript (bots typically don't execute JavaScript). That way if someone submits without it checked, the input is discarded. This can be an actual checkbox or something like a "slider". The downside is that it requires JavaScript, so your JavaScript-blocking users will be filtered out too.

2. Honeypot method. Bots typically fill out all the fields on a form. You can add several "fake fields" to the form that have real sounding names but are hidden from the user. Anybody filling them out therefore is a bot. EX:

<input type="text" name="website" style="display:none !important" tabindex="-1" autocomplete="off" value="">

You can add a few more fields with names like 'phone', 'fax', 'b_password', etc. if you want. Use display:none instead of an input of type 'hidden', since hidden inputs will likely be detected programmatically by the bot. Also make sure to set autocomplete="off" so a human user's browser doesn't autofill the hidden field and get them spam filtered.

You have to add this IN ADDITION to regular CSRF protection, which ensures they can only submit the form from your site/page.

This isn't perfect but gets rid of 99.9% of automated spam bots for small sites
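The server-side half of the honeypot is tiny. A minimal sketch in Python (for illustration, not OCaml), assuming the submitted form arrives as a dict of strings; the field names are the decoys suggested above and `looks_like_bot` is a made-up helper name.

```python
# Decoy field names from the post above; a human never sees or fills these.
HONEYPOT_FIELDS = ("website", "phone", "fax", "b_password")

def looks_like_bot(form):
    """True if any hidden decoy field came back with non-whitespace content."""
    return any(form.get(name, "").strip() for name in HONEYPOT_FIELDS)
```

If this returns True the post would be dropped. Some people prefer to show the bot a normal-looking success page when that happens so it gets no signal about why it failed.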
>>2396
it works even better if you make the form conform to some sort of HTML5 pattern matching or phone type / email type etc.

https://www.sitepoint.com/client-side-form-validation-html5/
>>2396
A final thing you can try as well is adding timestamp validation. Add a hidden field with the current timestamp to each form, and make the user wait at least 3 seconds to post (the vast majority of form posts under 3 seconds are spam). Finally, if you really want to go DEMIGOD MODE, find a way to programmatically vary the names of the fake fields and their position on the page (this prevents bots from remembering the position/name of the field that caused them to fail). God mode bonus: automatically IP ban bots caught using this method (for a small amount of time, like a day).
>>2395
>once you finish it, is there a chance you can upload the source code to github one time so others can see/use it? its a lot more accessible than hosting it on a custom site
I'd be perfectly fine with someone else doing this, but I have no intention to. I might eventually host a https://sourcehut.org/ instance for community members, and the site's source though if that's any consolation.

>>2396
>>2397
I agree that captchas seem a little dated. Option one is a bit out of the question because one of the principles of this imageboard is better support for individuals without JavaScript. On option two, I have heard good things about honeypot fields but didn't know all the details, so thanks for that. Reducing moderator load is going to be something I strive for. I'll only be doing this in earnest once it becomes an issue, but I would like to have the mechanisms set up in advance.

Something else I'd like to do in this domain is server side flood detection. For example you could make sure that no individual user posts at a speed beyond some threshold, or make sure that any two posts are not too similar within a time frame, or check to make sure that a certain image isn't uploaded within some time frame.
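A rough sketch of the first two checks, in Python for illustration. Everything here is an assumption: the `FloodGuard` name, the 30 and 300 second defaults, the in-memory dicts, and the fact that it only catches exact duplicates (after whitespace and case normalization) rather than posts that are merely "too similar".

```python
import hashlib

class FloodGuard:
    def __init__(self, cooldown=30.0, dup_window=300.0):
        self.cooldown = cooldown      # minimum seconds between posts per user
        self.dup_window = dup_window  # seconds during which a repeated body is refused
        self.last_post = {}           # user key -> time of last accepted post
        self.seen = {}                # body digest -> time it was last accepted

    def allow(self, user_key, body, now):
        """Return True and record the post if it passes both checks."""
        last = self.last_post.get(user_key)
        if last is not None and now - last < self.cooldown:
            return False
        normalized = " ".join(body.split()).lower()
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        seen = self.seen.get(digest)
        if seen is not None and now - seen < self.dup_window:
            return False
        self.last_post[user_key] = now
        self.seen[digest] = now
        return True
```

The same digest trick would work for the repeated-image check by hashing the uploaded file bytes instead of the message. A long-running server would also need to evict old entries from both dicts.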
>>2398
This is good stuff, cheers!
>>2399
In addition to the bot timestamp validation per form, you should also be recording the IP for each post and making users wait 20-30 seconds in between posts GLOBALLY (pretty standard for imageboards). One thing I'd like to suggest for privacy is to store a HASH of the user's IP rather than the IP itself (IP addresses are considered personal data under euro laws), using the most cryptographically secure hashing algorithm you can. Unfortunately the IPv4 address space is so small that it is trivial to break something like an MD5 IP hash; there are even websites that do this already.

Use something like SHA-512 with a salt (make sure the salt isn't hardcoded in the source, otherwise people will see it in the source code; maybe put it in the database?).

If you include both IPv4 and IPv6 addresses in the field, the address space should be large enough that it's hard to break via something like rainbow tables.

This way board owners can look at a users post history and ban by IP while also not having direct access to their personal data in the way of IPs.
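One way to read the "hash with a salt" idea is a keyed hash: HMAC-SHA512 over the packed address, with a secret key that lives outside the source tree. That is my substitution rather than exactly what was described above, and `ip_token` is a made-up name. A Python sketch for illustration:

```python
import hashlib
import hmac
import ipaddress

def ip_token(ip, key):
    """Return a hex token for an IPv4/IPv6 address string.

    key is a secret bytes value; the assumption is that it is loaded from
    server configuration or the database, never hardcoded in the source.
    """
    addr = ipaddress.ip_address(ip)  # raises ValueError on malformed input
    # Using the packed form means different spellings of one address agree.
    return hmac.new(key, addr.packed, hashlib.sha512).hexdigest()
```

Moderators would see and ban on the token only. Note that with only about four billion IPv4 addresses, anyone who obtains the key can still enumerate them all, so the privacy here rests on keeping the key secret rather than on the hash function.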
>>2400
also if you include the timestamp don't forget to encrypt it so the bot can't simply update it to now and submit
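Strictly speaking the timestamp doesn't need to be secret, it only needs to be tamper-proof, so signing it is enough. Here is a sketch that swaps encryption for an HMAC signature (my substitution, not what was literally suggested). It is Python for illustration; `SECRET`, `MAX_AGE`, `issue_token` and `token_ok` are all hypothetical, and only the 3 second minimum comes from the thread.

```python
import hashlib
import hmac
import time

# Assumption: in practice this key is loaded from server config, not the source.
SECRET = b"replace-with-a-random-server-side-key"
MIN_AGE = 3      # seconds a human plausibly needs before posting
MAX_AGE = 3600   # hypothetical upper bound so stale forms expire

def _sign(ts):
    return hmac.new(SECRET, ts.encode("utf-8"), hashlib.sha256).hexdigest()

def issue_token(now=None):
    """Value for the hidden form field: '<unix time>.<signature>'."""
    ts = str(int(time.time() if now is None else now))
    return ts + "." + _sign(ts)

def token_ok(token, now=None):
    """True if the signature is valid and the form age is within bounds."""
    ts, _, sig = token.partition(".")
    if not hmac.compare_digest(sig.encode("utf-8"), _sign(ts).encode("utf-8")):
        return False
    age = (time.time() if now is None else now) - int(ts)
    return MIN_AGE <= age <= MAX_AGE
```

A bot that rewrites the timestamp invalidates the signature, and one that replays an old form eventually runs into `MAX_AGE`.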
>>2401
Yah, salting and hashing IP addresses was actually something I had planned. Also it would be quite silly to put the salt in the source yes. I quite like the Serpent cipher so I might use that if I can. I think it's also possible to encrypt the entire database with fancy built in key rotation and what not as well.

There is some issue with moderators not knowing IP addresses. For example you can't know about IP blocks or do reverse IP lookup. Loads of folks have /64 IPv6 addresses so you could trivially avoid an IP ban, and occasionally you might find a bad actor through reverse IP lookup.
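One partial workaround for the /64 problem is to derive the ban key from the network prefix rather than the full address before anything is hashed. A sketch in Python for illustration, where `ban_key` is a made-up name and the /64 width is an assumption about how providers hand out addresses:

```python
import ipaddress

def ban_key(ip):
    """Collapse an IPv6 address to its /64 network; leave IPv4 unchanged."""
    addr = ipaddress.ip_address(ip)
    if addr.version == 6:
        # strict=False lets us pass a host address and get its enclosing network.
        return str(ipaddress.ip_network(f"{addr}/64", strict=False))
    return str(addr)
```

Hashing the output of this instead of the raw address means every address in one /64 maps to the same ban entry. It does nothing for the reverse lookup problem though.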

>>2402
That makes sense.
>>2403
>I think it's also possible to encrypt the entire database with fancy built in key rotation and what not as well.
There are evidently some issues with this in that MariaDB doesn't encrypt all the logs etc. when you encrypt the database: https://mariadb.com/kb/en/library/data-at-rest-encryption-overview/ Also I couldn't find a convenient way to use Serpent, so what I'll probably do is use AES-256-CBC for small critical fields like passwords, usernames, and IPs along with MariaDB's at-rest encryption for any table whose contents aren't exposed via the API to everyone. This is still something I'll need to look into more though as it's not my area of expertise.
>>2404
You don't encrypt passwords, you hash them. There is no reason for password data to be reversible; you should never be able to read them. You should only be able to verify that they're correct, which is what hashing does.
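For reference, a minimal sketch of what that looks like with a salted, deliberately slow hash. It uses PBKDF2 from Python's standard library purely for illustration; the iteration count and function names are assumptions, and a dedicated password-hashing library would be a reasonable alternative.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # hypothetical work factor; tune upward for your hardware

def hash_password(password):
    """Return (salt, digest); store both, never the password itself."""
    salt = secrets.token_bytes(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def verify_password(password, salt, digest):
    """Recompute the hash with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, digest)
```

Nothing stored here can be turned back into the password; verification works by recomputing, not decrypting.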
>>2405
>You don't encrypt passwords. you hash them. There is no reason for password data to be reversible, you should never be able to read them. You should only be able to verify that they're correct, which is what hashing does.
Oh, my bad, that makes sense. As I mentioned this isn't my area of expertise, I'll probably try to track down a book to read on web application security.
>>2406
You probably want to take a look at Lynxchan's source code, or Vichan's. I'm pretty sure they're not encrypting the IP addresses either. Yeah they're PHP and not OCaml, but they'll show you what to do on the database side.
>>2407
>You probably want to take a look at Lynxchan's source code, or Vichan's. I'm pretty sure they're not encrypting the IP addresses either. Yeah they're PHP and not OCaml, but they'll show you what to do on the database side.
I don't really like accepting any design decision without fully understanding why it was made and thinking about how it lines up with my objectives, so I think I'll probably have to read a book as I mentioned. I might be able to take a peek at some other imageboard engines to see if there is anything clever I'm missing in terms of my implementation though, yah.
>>2407
>You probably want to take a look at Lynxchan's source code, or Vichan's.
Most currently written IB software is pretty shit TBH, the source code for Vichan and offshoots is godawful.

most imageboard technology is stuck in the last decade
lynxchan uses mysql and a lot of IB implementations on SQL databases use 1 big god table for everything (no normalization).

The whole point of writing a new imageboard from scratch and not just making yet another crappy custom fork of Vichan is to rethink these design decisions, both technical, ui/ux, and site mechanics.

>>2407
>I'm pretty sure they're not encrypting the IP addresses either
well, they really should, because on a website where anyone can create a board, if you post on someone's board one time the mods/vols can dox you. I don't trust BOs with PII like IP addresses as they are just randos from the internet like you and me. Why should they get to see my IP and geolocate me? At most the website owner/admin should see it.

>>2405
this, you should be hashing passwords (although that is common knowledge in web dev)
>>2409
lynxchan uses mongo*
interesting.
I have another question: how would you implement the hover preview of quotes without JS (CSS only)? Like for example when I quote OP: >>2315. There is also the issue of "back links", i.e. posts that have been replied to also have a hover quote of the posts replying to them.

This is probably hard to do as CSS only?
>>2412
Back-references are trivial to implement without JavaScript. The hover display is tricky though, and I effectively decided against it; my reasoning is here: >>2326
I basically slept all day today, and when I wasn't sleeping I might as well have been. So very little was accomplished today. I made some improvements to the code parser and a combinator called "string_till", but that's about it. The solution I thought I had to parsing plain text did not work. Despite this the parser should now work for all "well formed" data where opening tags are always closed at the scope they were opened with.
>>2413
what about forward-references?
the thing that shows up like No. xxxx >>xxxx >>xxxx

i.e. it shows what posts have linked to it?
>>2415
>what about forward-references?
Oh, that's what I thought you meant by back-references. Yah, those should be trivial to implement in only HTML/CSS, and typically don't work without JavaScript for whatever reason. I was kinda out of it yesterday sorry.
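The reason no JavaScript is needed is that the server can compute the reverse links when it renders the thread. A sketch of that step in Python (for illustration only; `build_backlinks` and the simple `>>N` regex are assumptions, not the engine's actual reference parser):

```python
import re
from collections import defaultdict

QUOTE = re.compile(r">>(\d+)")

def build_backlinks(posts):
    """posts maps post number -> body text.

    Returns a dict mapping each quoted post to the sorted list of posts
    that reference it, ignoring self-quotes and unknown post numbers.
    """
    links = defaultdict(set)
    for number, body in posts.items():
        for target in QUOTE.findall(body):
            t = int(target)
            if t in posts and t != number:
                links[t].add(number)
    return {t: sorted(sources) for t, sources in links.items()}
```

The template then just prints the list next to each post's number as ordinary anchor links.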
>>2416
yes i mean there are back references (in the post) linking back to a previous post. However to the right of the post they are being linked to, it also shows a similar link to the post it is being quoted in
>>2417
That makes sense.
Today was slightly more productive than yesterday. I wrote a serializer from the AST to S-expressions using Faraday, and made some clean ups to the parser. I tried a few more tactics to try to get plain text parsing to work the way I want which failed, so I've resorted to asking on a (different) forum for advice. For those curious, I want to pass around a list of scoped text formatting rules for a few reasons, and Angstrom's fixed-point combinator doesn't seem to allow any means of doing this. I've also decided to change the stack a bit, I think I'll be using React instead of Lwt, and probably a template engine, and a custom Faraday serializer instead of TyXML. I also am probably going to be adding UUCP, Logs, and Mime-Magic as dependencies.
I like OCaml, it's very cool.
>>2420
>I like OCaml, it's very cool.
It has its fair share of issues (UTF-8, threads, documentation, ecosystem fragmentation, the standard library, etc.) but I enjoy it quite a lot too.
>>2419
>using React
I thought you said you were gonna go minimal JavaScript? If you're going to use an SPA framework at least use mithril.js since it's a minimalistic one
https://mithril.js.org/
it's way faster and smaller

>>2421
>>2420
OCaml suffers from the issue of bad multicore/concurrent support, same as any 1990s language. It's still a pretty good language, looks like Haskell for example was very OCaml inspired
>>2422
>i thought you said you were gonna go minimal javascript? if you're going to use an SPA framework at least use mithril.js since its like a minimalistic one
Yes, I will be using minimal JavaScript (I don't plan on using any libraries, frameworks, etc.), I was referring to OCaml React here: https://erratique.ch/software/react/doc/React I'll probably still need Lwt a little to do concurrent IO in front of http/af, but other than that I'll probably just be using Functional Reactive Programming, which should result in cleaner software and significantly reduced development time at an extremely slight performance cost.

>Ocaml suffers the issue of bad multicore/concurrent support, same as any 1990s language. Its still a pretty good language, looks like haskell for example was very ocaml inspired
Concurrency support is actually quite good through Async and Lwt. Also, OCaml is actually 6 years younger than Haskell, although they have a common ancestor in SML, which Haskell is a bit more distantly related to. Also I'm not sure it's that common to have multicore support as non-existent as OCaml's (it's literally implemented as a global lock), luckily it is in the works though: https://github.com/ocaml-multicore/ocaml-multicore/wiki It is a pretty great language though despite its many faults.
>>2423
so hows the project going
what about [code][/code] style tags like they have on lainchan
>>2424
>so hows the project going
It's fine. As always I only really work on it on Monday and Tuesday, making small minor changes on other days, but I'm hoping to have HTML generation complete this following week. With the help of an individual from another forum I recently found a simple way around a problem I was having, so I should be able to easily write the remaining features for the parser and also the HTML serializer. These remaining features are allowing the closing of a tag in a scope other than the one it started in, and interpreting opening/closing tags without their closing/opening counterpart as plain text.

>>2425
>what about [code][/code] style tags like they have on lainchan
I'm not certain what you mean. I'm not super interested in syntax highlighting in the code blocks if that's what you mean. You can't do it right without having the user input what language they are writing, and it adds loads of extra elements necessary to make it work. Not to mention it is not possible to completely implement this feature as languages spring up all the time. If you mean the bbcode style tags I've thought about this, and I'd implement this style tag for code blocks if ``` turned out to appear too much in source code, which shouldn't be the case.
>>2426
on lainchan they do have syntax highlighting.
>>2427
>on lainchan they do have syntax highlighting.
Yes, I know. It works as I've described except if you don't provide a valid language annotation from the list: https://github.com/highlightjs/highlight.js/tree/master/src/languages it just uses a default which does inconsistent highlighting based on some common features across languages. In practice pretty much no one uses the annotations on that site so most of the syntax highlighting is broken and it requires the costly HTML tags and CSS tags necessary for it to work at all. Even when the annotation is provided it tends to produce quite ugly results, see: https://www.lainchan.org/test/res/368.html#920

Of course it's always possible to do better than existing solutions. The costs are just too high to me though: the necessary decrease in efficiency, the persistent cost of adding new languages, and the requirement that users add this annotation to their source blocks (which in practice they don't seem willing to do) just make this feature undesirable to me. This is ignoring the fact that this is actually a rather time consuming thing to implement in the first place because you have to write a parser for every language you want to highlight.
>>2428
yeah maybe syntax highlighting is dumb, you're making an imageboard not an IDE

any thoughts on what sort of font you want to use for UX though?

serif or monospace?
>>2429
>Yeah maybe syntax highlighting is dumb, you're making an imageboard not an IDE.
It wasn't the worst idea.

>Any thoughts on what sort of font you want to use for UX though? Serif or Monospace?
At the moment I'm just using whatever the default Sans-Serif and Monospace fonts are for their various purposes. Other than not wanting web-fonts I'm not real picky on this, I'd be perfectly fine accepting whatever users or folks in this thread like (so long as I don't think it's too bad).
One step forward two steps back. It really doesn't seem possible to express the oddities of my markup language in Angstrom, additionally it lacks Unicode support meaning I can't really filter ranges of Unicode characters easily. I'll rewrite things in Ulex and Menhir. Supporting only Latin-1 and "properly formed" markup is not acceptable.
>>2431
As far as something actually productive goes, here's the global rule set I'm considering. I'm posting it here reluctantly as I know this is the type of thing which could derail the thread or push it in the wrong direction. Regardless it seems significant enough to post here despite this:
0. Do not discuss the spectacle.
1. Do not discuss identity. (where identity is a group association based on some social construct)
2. Do not respond to perceived wreckers.
3. Always post content in relevant existing threads before making your own.
4. Always argue in good faith and avoid using personal attacks.
5. Always elaborate on your opinions, and expand on your thread topics.
6. Always spoiler any NSFW content and make sure it's relevant to the discussion.
7. Do not post anything illegal in the country of Romania. (or wherever the host is located)
>>2432
are these the rules for the chan you're making?

and does ocaml really not have utf8 support?
>>2433
>are these the rules for the chan you're making?
Well, they are what I'm considering at least, they are subject to change either based on my own thoughts or feedback from you guys.

>and does ocaml really not have utf8 support?
OCaml has the Uchar.t type for individual Unicode characters and allows for Unicode escaping in strings such as "\u{1234}" but strings default to Latin-1 and the char type is only Latin-1. There are also libraries that add Unicode support like UUCP and Camomile, but none of this matters if a relevant library you're using doesn't support these things such as is the case with Angstrom. Ulex supports Unicode by implementing it themselves I believe.
(90.64 KB 1280x720 mpv-shot0016.jpg)
>>2432
>0. Do not discuss the spectacle.
>>2435
I'd typically ask you to expand on this so that I can understand your critique better, but instead I'll expand on my rule so that it's clear what this would mean in practice. The idea is that this rule would limit conversation of self-improvement in authentic creative tasks or discussion of ideas independent of the spectacle. This is clearly a clash of cultures with what is typically discussed, but there are communities which exist for this type of thing, so it's not unprecedented. I don't plan on mandating détournement or anything like that. In any case, it seems nearly every community does allow discussion of the spectacle, so for those that don't like this rule there are many other places for them to go.
would limit conversation to**
(360.90 KB 512x512 NO FUN.png)
>>2432
you missed a rule
>>2438
I can add this if it becomes too much of an issue. ツ
>>2439
>>2431
I hate to do this but I'm actually going to be implementing the 80% solution for now. I'll make a note file of how I should rewrite things in the future, but here's a short list of the features that will be missing from the markup language until I circle back: UTF-8 support (currently only Latin-1), closing tags at a scope other than the one they started at, support for link schemes other than ftp, http, https, and gopher, and a number of whitespace optimizations such as converting repeated line breaks into paragraphs when possible, deleting tags that only affect the foreground when there is only whitespace in them, and converting all consecutive non-line-breaking whitespace into a single space.

I'll polish up my Angstrom parser and implement a Faraday HTML serializer today or tomorrow so that, as I said earlier, HTML generation is finished by this week.
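For anyone curious what two of the deferred whitespace optimizations amount to, here is a small standalone sketch in Python (illustration only, operating on raw text rather than on the AST as the real optimization pass would). `to_paragraphs` is a made-up name, and "non-line-breaking whitespace" is assumed to mean just spaces and tabs.

```python
import re

def to_paragraphs(text):
    """Split text into paragraphs, each a list of cleaned lines.

    Runs of spaces/tabs collapse to one space, and one or more blank
    lines in a row become a single paragraph break.
    """
    paragraphs, current = [], []
    for raw in text.splitlines():
        line = re.sub(r"[ \t]+", " ", raw).strip()
        if line:
            current.append(line)
        elif current:
            paragraphs.append(current)
            current = []
    if current:
        paragraphs.append(current)
    return paragraphs
```

Each inner list would become one paragraph element, with the lines inside it separated by ordinary line breaks.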
>>2441
based
>>2441
To maximize the part of my life authentically lived, under normal circumstances I don't allow myself to use the internet except to find answers to specific questions or to download books, and I don't engage with entertainment at all. Over the past several months, due to some stress in my life, I've been enforcing the internet part of my rules less stringently than I would like. I'll be continuing development but dramatically slowing my dev-log. It's unlikely that questions asked here will be answered in any sort of timely fashion. I hope you all understand. Just for the record, I don't think Asceticism is in itself good, it's just a tool I use.
>>2443
wait so are you discontinuing development? i was planning on using this... :(
>>2444
No, "I'll be continuing development but dramatically slowing my dev-log." In other words I'll be programming the same amount but talking about it less. Also keep in mind it's going to take a while before it's competitive with existing engines, you'd probably be better off using those for the foreseeable future.
>>2315
I'm late to the game here, why was >>>/tech/ shut down?
>>2446
they said there wasn't enough activity, so they combined the non-leftypol boards other than GET into one single board
Bumping, op come back
hey I haven't read the thread, but, how hard is it?
>>2449
OP is trying to make a new imageboard software. If you want to make one yourself, try using LynxChan
It's amusing to me that the formatting on this site managed to become even more broken, and that this somehow managed to affect posts already stored in the database; this thread is hideous now. Anyway I made a good bit more progress on this project but decided to abandon it. My justification was that while the left needs to establish new hegemonic institutions, of which social-media is probably the strongest modern example, these institutions are worthless unless you're willing to use the power these institutions hold to enforce your will, and I simply don't have the time to realistically do this. Another issue is that the internet, probably reflecting modern cultural trends (as promoted by the bourgeoisie), is hedonist and obsessed with the spectacle (not to mention least common denominator politics like social democracy). These cultural traits would be very difficult to overcome, and would require rules that would be especially costly to enforce.

For now I'm going to be focusing on improving my skills. I've recently been studying GADTs, existential types, type-level programming, eta-reduction and all sorts of interesting things in addition to my standard mathematics studies; hopefully my improved skills will be useful to us in the future. In any case, I will likely stop posting on this forum or any other forum, and it's fair to assume that any posts from this point on anywhere on the internet are not me. I wish you all the best, good bye.

>>2449
>hey I haven't read the thread, but, how hard is it?
It's a pretty simple web application, not particularly difficult so long as you don't obsess over the details (which I tend to do). I'd only ever made two static sites before this and I was still able to make CSS that was superior to any imageboard I was able to find, and the backend was very clean if incomplete despite my programming experience having nothing to do with the internet.

>>2450
>OP is trying to make a new imageboard software, If you want to make one yourself, try using LynxChan.
Just to be clear using LynxChan isn't creating a new image-board engine, which was my intent.
>>2451
Sad to hear that, good luck on your next projects anon tho
>Just to be clear using LynxChan isn't creating a new image-board engine, which was my intent.
Oh yeah I know, I was just telling him to use it if he wants to make a new imageboard
feature request: make threads look good in Firefox's reader mode
>>2453
lol nvm just saw this wasn't happening anymore RIP
