A Flask-based web app that predicts whether a news article headline is REAL or FAKE from a given URL. It scrapes the page headline, detects and translates non‑English text to English, vectorizes with TF‑IDF, and classifies using Logistic Regression. Authenticated users have their analyzed URLs saved to MySQL.
How it works:

- Training data: `dataset.csv` with columns `news` and `label` (0 = REAL, 1 = FAKE).
- Model: `TfidfVectorizer` + `LogisticRegression`.
- The user opens `/` and sees a single-page interface.
- The headline is taken from the page's `<h1>` (BeautifulSoup).
- For authenticated users, the submitted URL is saved with `username` and `full_name` into MySQL.
- The response reports `headline`, `result` ("REAL" or "FAKE"), `language`, and whether it was saved.

Routes:

- `/` (GET): Renders the UI.
- `/api/me` (GET): Returns the current session user.
- `/api/signup` (POST): Creates a user; starts a session.
- `/api/signin` (POST): Authenticates; starts a session.
- `/api/signout` (POST): Clears the session.
- `/detect` (POST): Core inference endpoint.

Sessions rely on the Flask `secret_key`.

Tech stack:

- `requests` to fetch the URL with a standard user-agent and timeout.
- `BeautifulSoup` to extract the first `<h1>` as the headline.
- `langdetect` to detect the language of the headline.
- `deep-translator` (GoogleTranslator) to translate to English if not English.
- `TfidfVectorizer` for text feature extraction.
- `LogisticRegression(max_iter=1000)` for binary classification.
- The model is trained from `dataset.csv` and kept in memory.
- `users` table: stores username, full name, password hash.
- `url_entries` table: stores username, full name, and submitted URL.
- Passwords are hashed with `generate_password_hash`.
- Saved URL entries carry `username` and `full_name`.
- `templates/index.html`: Single-page app with modals and fetch-based API calls.
- `static/styles.css`: Modern glassmorphism styles.

Project files:

- `app.py`: Flask app, ML pipeline, routes, DB init, and handlers.
- `templates/index.html`: Frontend page with forms, modals, and client JS.
- `static/styles.css`: UI styles.
- `dataset.csv`: Training data with `news,label`.
- `requirements.txt`: Python dependencies.
- `venv/`: (local virtual environment, optional)

Setup:

python -m venv venv
venv\Scripts\activate
pip install -r requirements.txt
set FLASK_SECRET_KEY=your_very_secret_key
Make sure the MySQL connection settings in `app.py` match your setup.
`app.py` creates the `final_project` database and required tables if they don’t exist.

python app.py

Then open http://127.0.0.1:4000 in a browser.
On startup, `dataset.csv` is loaded and the model is trained.

`/api/me` (GET): returns `null` or the current session user, for example:
{ "username": "john", "full_name": "John Doe" }
`/api/signup` (POST, x-www-form-urlencoded or JSON)
- Fields: `username`, `full_name`, `password`
- Response: `{ ok: true, username, full_name }` or `{ error }`
`/api/signin` (POST, x-www-form-urlencoded or JSON)
- Fields: `username`, `password`
- Response: `{ ok: true, username, full_name }` or `{ error }`
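
The sign-up and sign-in endpoints store and check a password hash rather than the password itself. A minimal sketch of that hash-and-verify step follows. It assumes Werkzeug's `generate_password_hash` together with `check_password_hash` (the README names only the former), and it uses a hypothetical in-memory `users` dict in place of the MySQL `users` table. The error strings are placeholders, not the app's actual messages.

```python
from werkzeug.security import check_password_hash, generate_password_hash

# Hypothetical in-memory stand-in for the MySQL `users` table.
users = {}


def sign_up(username, full_name, password):
    """Create a user, storing only a salted hash of the password."""
    if username in users:
        return {"error": "Username already exists"}  # placeholder message
    users[username] = {
        "full_name": full_name,
        "password_hash": generate_password_hash(password),
    }
    return {"ok": True, "username": username, "full_name": full_name}


def sign_in(username, password):
    """Verify the password against the stored hash."""
    row = users.get(username)
    if row is None or not check_password_hash(row["password_hash"], password):
        return {"error": "Invalid credentials"}  # placeholder message
    return {"ok": True, "username": username, "full_name": row["full_name"]}
```

In the real app a successful call would also start a Flask session; that part is omitted here.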
`/api/signout` (POST)
- Response: `{ ok: true }`
`/detect` (POST, x-www-form-urlencoded)
- Field: `url`
- Success response:
{
"headline": "Detected or translated headline text",
"result": "REAL",
"language": "en",
"saved": true
}
Error responses:
{ "error": "URL fetch failed: ..." }
{ "error": "No headline found" }
{ "error": "Model prediction failed: ..." }
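
The `No headline found` error corresponds to pages without an `<h1>`. The sketch below illustrates that extraction step using only the standard library's `html.parser` as a stand-in; the app itself uses BeautifulSoup, so the class and function names here (`FirstH1Parser`, `extract_headline`) are hypothetical and the exact behavior of the real handler may differ.

```python
from html.parser import HTMLParser


class FirstH1Parser(HTMLParser):
    """Collects the text of the first <h1> element and ignores later ones."""

    def __init__(self):
        super().__init__()
        self._in_h1 = False
        self._done = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1" and not self._done:
            self._in_h1 = True

    def handle_endtag(self, tag):
        if tag == "h1" and self._in_h1:
            self._in_h1 = False
            self._done = True

    def handle_data(self, data):
        # Text inside nested tags (e.g. <b>) is still part of the headline.
        if self._in_h1:
            self._parts.append(data)

    @property
    def headline(self):
        # Collapse runs of whitespace and trim the ends.
        return " ".join("".join(self._parts).split())


def extract_headline(html):
    """Return {"headline": ...} or the documented error payload."""
    parser = FirstH1Parser()
    parser.feed(html)
    parser.close()
    if not parser.headline:
        return {"error": "No headline found"}
    return {"headline": parser.headline}
```

For example, `extract_headline("<p>no heading here</p>")` yields the error payload, while a page with several `<h1>` elements contributes only the first.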
curl -X POST http://127.0.0.1:4000/detect ^
-H "Content-Type: application/x-www-form-urlencoded" ^
--data-urlencode "url=https://example.com/news-article"
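
The same request can be made from Python. This is a minimal sketch using only the standard library; the helper names `build_detect_request` and `detect` are hypothetical, and the base URL is the local address given in the run instructions. Whether the app returns its error payloads with a non-2xx status is not stated, so the sketch does not handle `HTTPError`.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "http://127.0.0.1:4000"  # local address from the run instructions


def build_detect_request(article_url, base_url=BASE_URL):
    """Build the form-encoded POST for /detect without sending it."""
    body = urlencode({"url": article_url}).encode("ascii")
    return Request(
        f"{base_url}/detect",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def detect(article_url, base_url=BASE_URL, timeout=30):
    """Send the request and decode the JSON reply (the app must be running)."""
    request = build_detect_request(article_url, base_url)
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the server running, `detect("https://example.com/news-article")` should return a dict with the `headline`, `result`, `language`, and `saved` keys, or an `error` key.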
Training data is `dataset.csv` with headers:
- `news` (string): headline or news text
- `label` (int): 0 = REAL, 1 = FAKE

Pipeline: `TfidfVectorizer` → `LogisticRegression(max_iter=1000)`.
Predictions map to `"REAL"` for `0`, `"FAKE"` for `1`.

MySQL connection settings: `app.py`.
Database: `final_project` (auto-created if missing).
Tables:
- `users(id, username UNIQUE, full_name, password_hash, created_at)`
- `url_entries(id, username, full_name, url, created_at, INDEX(username))`
Notes:

- The single-page UI (`/`) calls `/api/me`, `/api/signup`, `/api/signin`, `/api/signout`, and `/detect`.
- Ensure `dataset.csv` exists with columns `news,label`. Missing/invalid data will be logged on startup.
- The headline is read from the first `<h1>` element. Sites without `<h1>` may fail.
- Update the database settings in `app.py` to match your environment.
- Set `FLASK_SECRET_KEY` in production.
- Update `dataset.csv` and restart the app to retrain the model.
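
Because the model is fitted at startup and kept in memory, retraining amounts to re-running the fit on the updated data. The sketch below shows one way the TF-IDF plus Logistic Regression step could look in scikit-learn. Wrapping both in `make_pipeline`, the helper names `train` and `classify`, and the toy rows used in place of `dataset.csv` are all assumptions made for illustration; only the estimator choice, `max_iter=1000`, and the 0/1 label mapping come from this README.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Label mapping as documented: 0 = REAL, 1 = FAKE.
LABELS = {0: "REAL", 1: "FAKE"}


def train(texts, labels):
    """Fit TF-IDF features and a logistic regression classifier together."""
    model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    model.fit(texts, labels)
    return model


def classify(model, headline):
    """Return "REAL" or "FAKE" for a single English headline."""
    predicted = int(model.predict([headline])[0])
    return LABELS[predicted]


# Toy rows standing in for the `news` and `label` columns of dataset.csv.
texts = [
    "Government publishes annual budget report",
    "City council approves new public transport plan",
    "Aliens secretly replace world leaders overnight",
    "Miracle pill makes people immortal, doctors stunned",
]
labels = [0, 0, 1, 1]

model = train(texts, labels)
print(classify(model, "Council publishes transport budget"))
```

A toy dataset this small says nothing about real accuracy; it only demonstrates the shape of the train and predict calls.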